Compositions for use in identification of papillomavirus

ABSTRACT

The present invention relates generally to identification of HPV, and provides methods, compositions and kits useful for this purpose when combined, for example, with molecular mass or base composition analysis.

CROSS REFERENCE TO RELATED PATENT APPLICATIONS

The present application claims the benefit of priority to U.S. Provisional Application No. 61/102,324, filed Oct. 2, 2008 and is further a continuation-in-part of U.S. patent application Ser. No. 11/368,233, filed Mar. 3, 2006, which claims the benefit of priority to U.S. Provisional Application Nos. 60/658,248, filed Mar. 3, 2005, 60/705,631, filed Aug. 3, 2005, 60/732,539, filed Nov. 1, 2005, and 60/740,617, filed Nov. 28, 2005, which are each incorporated by reference in their entirety.

STATEMENT OF GOVERNMENT SUPPORT

This invention was made with United States government support under NIH/NIAID contract No. N01 A140100 awarded by the National Institutes of Health. The United States government has certain rights in the invention.

FIELD OF THE INVENTION

The present invention relates generally to identification of Human papillomavirus (HPV), and provides methods, compositions and kits useful for this purpose when combined, for example, with molecular mass or base composition analysis.

BACKGROUND OF THE INVENTION

Papillomaviruses are a diverse group of DNA-based viruses that infect the skin and mucous membranes of humans and a variety of animals. Approximately 130 HPV types have been identified. About 30-40 HPV types are typically transmitted through sexual contact and infect the anogenital region. Different HPV types are associated with different pathological risks. Some HPV types result in latent infection, while some can cause warts, and others may cause a subclinical infection resulting in precancerous lesions. Persistent infection with a “high-risk” subset of sexually transmitted HPV may lead to potentially precancerous lesions and can progress to invasive cancer. HPV infection is a necessary factor in the development of nearly all cases of cervical cancer.

SUMMARY OF THE INVENTION

The present invention relates generally to detection and identification of HPV, and provides methods, compositions and kits useful for this purpose when combined, for example, with molecular mass or base composition analysis. However, the compositions and methods find use in a variety of biological sample analysis techniques and are not limited to processes that employ or require molecular mass or base composition analysis. For example, primers described herein find use in a variety of research, surveillance, and diagnostic approaches that utilize one or more primers, including a variety of approaches that employ the polymerase chain reaction.

To further illustrate, in certain embodiments, the invention provides for the rapid detection and characterization of papillomavirus. In some embodiments the present invention provides a composition comprising at least one purified oligonucleotide primer pair that comprises forward and reverse primers, wherein said primer pair comprises nucleic acid sequences that are substantially complementary to nucleic acid sequences of two or more different bioagents belonging to the Papillomaviridae family, wherein the primer pair is configured to produce amplicons comprising different base compositions that correspond to the two or more different bioagents. In addition to compositions and kits that include one or more of the primer pairs described herein, the invention also relates to methods and systems.

In one aspect, the present invention provides compositions comprising at least one purified oligonucleotide primer pair that comprises forward and reverse primers about 15 to 35 nucleobases in length, wherein the forward primer comprises at least 70% identity (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) with a sequence selected from SEQ ID NOs: 1-8 and 17-43, and wherein the reverse primer comprises at least 70% identity (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) with a sequence selected from SEQ ID NOs: 9-16 and 44-70. Typically, the primer pair is configured to hybridize with HPV nucleic acids. In further embodiments, the primer pair is selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, 8:16, 17:44, 18:45, 19:46, 20:47, 21:48, 22:49, 23:50, 24:51, 25:52, 26:53, 27:54, 28:55, 29:56, 30:57, 31:58, 32:59, 33:60, 34:61, 35:62, 36:63, 37:64, 38:65, 39:66, 40:67, 41:68, 42:69, and 43:70. In certain embodiments, the forward and/or reverse primer has a base length selected from the group consisting of: 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides, although both shorter and longer primers may be used.

In another aspect, the invention provides a purified oligonucleotide primer pair, comprising a forward primer and a reverse primer that each independently comprise 14 to 40 consecutive nucleobases selected from the primer pair sequences shown in Table 1 and/or Table 2, which primer pair is configured to generate an amplicon between about 50 and 150 consecutive nucleobases in length.

In another aspect, the invention provides a kit comprising at least one purified oligonucleotide primer pair that comprises forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein the forward primer comprises at least 70%, at least 80%, at least 90%, at least 95%, or 100% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 1-8 and 17-43, and the reverse primer comprises at least 70% sequence identity (e.g., 75%, 85%, or 95%) with a sequence selected from the group consisting of SEQ ID NOS: 9-16 and 44-70. In some embodiments, the kit comprises a primer pair that is a broad range survey primer pair (e.g., specific for nucleic acid of a housekeeping gene found in many or all members of a category of organism).

In other embodiments, the amplicons produced with the primers are 45 to 200 nucleobases in length (e.g., 45 . . . 75 . . . 125 . . . 175 . . . 200). In some embodiments, a non-templated T residue on the 5′-end of said forward and/or reverse primer is removed. In still other embodiments, the forward and/or reverse primer further comprises a non-templated T residue on the 5′-end. In additional embodiments, the forward and/or reverse primer comprises at least one molecular mass modifying tag. In some embodiments, the forward and/or reverse primer comprises at least one modified nucleobase. In further embodiments, the modified nucleobase is 5-propynyluracil or 5-propynylcytosine. In other embodiments, the modified nucleobase is a mass modified nucleobase. In still other embodiments, the mass modified nucleobase is 5-Iodo-C. In additional embodiments, the modified nucleobase is a universal nucleobase. In some embodiments, the universal nucleobase is inosine. In certain embodiments, kits comprise the compositions described herein.

In particular embodiments, the present invention provides methods of determining a presence of an HPV in at least one sample, the method comprising: (a) amplifying one or more (e.g., two or more, three or more, four or more, etc.; one to two, one to three, one to four, etc.; two, three, four, etc.) segments of at least one nucleic acid from the sample using at least one purified oligonucleotide primer pair that comprises forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein the forward primer comprises at least 70% (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 1-8 and 17-43, and the reverse primer comprises at least 70% (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 9-16 and 44-70 to produce at least one amplification product; and (b) detecting the amplification product, thereby determining the presence of the HPV in the sample.

In certain embodiments, step (b) comprises determining an amount of the HPV in the sample. In further embodiments, step (b) comprises detecting a molecular mass of the amplification product. In other embodiments, step (b) comprises determining a base composition of the amplification product, wherein the base composition identifies the number of A residues, C residues, T residues, G residues, U residues, analogs thereof and/or mass tag residues thereof in the amplification product, whereby the base composition indicates the presence of the HPV in the sample or identifies the pathogenicity of the HPV in the sample. In particular embodiments, the methods further comprise comparing the base composition of the amplification product to calculated or measured base compositions of amplification products of one or more known HPV present in a database, for example, with the proviso that sequencing of the amplification product is not used to indicate the presence of or to identify the HPV, wherein a match between the determined base composition and the calculated or measured base composition in the database indicates the presence of or identifies the HPV. In some embodiments, the identification of HPV is at the biological kingdom level, phylum level, class level, order level, family level, genus level, species level, sub-type level (e.g., strain level), genotype level, or individual identity level.

In some embodiments, the present invention provides methods of identifying one or more HPV bioagents in a sample, the method comprising: (a) amplifying two or more segments of a nucleic acid from the one or more HPV bioagents in the sample with two or more oligonucleotide primer pairs to obtain two or more amplification products (e.g., from a single bioagent); (b) determining two or more molecular masses and/or base compositions of the two or more amplification products; and (c) comparing the two or more molecular masses and/or the base compositions of the two or more amplification products with known molecular masses and/or known base compositions of amplification products of known HPV bioagents produced with the two or more primer pairs to identify the one or more HPV bioagents in the sample. In certain embodiments, the methods comprise identifying the one or more HPV bioagents in the sample using three, four, five, six, seven, eight or more primer pairs. In other embodiments, the one or more HPV bioagents in the sample cannot be identified using a single primer pair of the two or more primer pairs. In particular embodiments, the methods comprise obtaining the two or more molecular masses of the two or more amplification products via mass spectrometry. In certain embodiments, the methods comprise calculating the two or more base compositions from the two or more molecular masses of the two or more amplification products. In some embodiments, the HPV bioagents are selected from the group consisting of a HPV genus, a species thereof, a sub-species thereof, and combinations thereof.
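The comparison in step (c) can be pictured as intersecting the candidate sets returned by each primer pair. The following is a minimal sketch only; the primer pair names, base composition tuples (ordered A, G, C, T), and bioagent labels are hypothetical placeholders and are not taken from the tables or sequence listing of this disclosure.

```python
# Hypothetical reference data: for each primer pair, a mapping from a base
# composition (A, G, C, T) to the set of known bioagents that yield it.
KNOWN_COMPOSITIONS = {
    "pair_1": {
        (10, 12, 9, 14): {"type_A", "type_B"},
        (11, 12, 9, 13): {"type_C"},
    },
    "pair_2": {
        (20, 18, 15, 17): {"type_A", "type_C"},
        (19, 18, 16, 17): {"type_B"},
    },
}


def identify(observed):
    """Intersect the candidates matched by every observed amplicon.

    `observed` maps a primer pair name to the base composition measured
    for the amplicon produced with that pair.
    """
    candidates = None
    for pair_name, composition in observed.items():
        matches = KNOWN_COMPOSITIONS.get(pair_name, {}).get(composition, set())
        if candidates is None:
            candidates = set(matches)
        else:
            candidates &= matches
    return candidates if candidates is not None else set()


# One primer pair alone is ambiguous; two together resolve a single candidate.
single = identify({"pair_1": (10, 12, 9, 14)})
combined = identify({"pair_1": (10, 12, 9, 14), "pair_2": (20, 18, 15, 17)})
print(sorted(single), sorted(combined))
```

In this toy data set neither primer pair distinguishes all three placeholder types by itself, which mirrors the embodiment in which the bioagent cannot be identified using a single primer pair.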

In some embodiments, the present invention provides methods of identifying one or more strains of HPV in a sample, the method comprising: (a) amplifying two or more segments of a nucleic acid from the one or more HPV in the sample with first and second oligonucleotide primer pairs to obtain two or more amplification products, wherein the first primer pair is a broad range survey primer pair, and wherein the second primer pair produces an amplicon that reveals species, sub-type, strain, or genotype-specific information; (b) determining two or more molecular masses and/or base compositions of the two or more amplification products; and (c) comparing the two or more molecular masses and/or the base compositions of the two or more amplification products with known molecular masses and/or known base compositions of amplification products of known HPV produced with the first and second primer pairs to identify the HPV in the sample. In some embodiments, the second primer pair amplifies a portion of a gene from HPV.

In certain embodiments, the second primer pair comprises forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein the forward primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 1-8 and 17-43, and the reverse primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 9-16 and 44-70 to produce at least one amplification product. In further embodiments, the obtaining the two or more molecular masses of the two or more amplification products is via mass spectrometry. In some embodiments, the methods comprise calculating the two or more base compositions from the two or more molecular masses of the two or more amplification products. In further embodiments, the HPV is selected from the group consisting of: the family Papillomaviridae, the genus Alphapapillomavirus, the genus Betapapillomavirus, the genus Gammapapillomavirus, the genus Mupapillomavirus, and the genus Nupapillomavirus.

In some embodiments, the second primer pair is selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, 8:16, 17:44, 18:45, 19:46, 20:47, 21:48, 22:49, 23:50, 24:51, 25:52, 26:53, 27:54, 28:55, 29:56, 30:57, 31:58, 32:59, 33:60, 34:61, 35:62, 36:63, 37:64, 38:65, 39:66, 40:67, 41:68, 42:69, and 43:70. In other embodiments, the determining the two or more molecular masses and/or base compositions is conducted without sequencing the two or more amplification products. In certain embodiments, the HPV in the sample cannot be identified using a single primer pair of the first and second primer pairs. In other embodiments, the HPV in the sample is identified by comparing three or more molecular masses and/or base compositions of three or more amplification products with a database of known molecular masses and/or known base compositions of amplification products of known HPV produced with the first and second primer pairs, and a third primer pair.

In further embodiments, members of the first and second primer pairs hybridize to conserved regions of the nucleic acid that flank a variable region. In some embodiments, the variable region varies between at least two species of HPV. In particular embodiments, the variable region uniquely varies between at least two (e.g., 3, 4, 5, 6, 7, 8, 9, 10, . . . , 20, etc.) genuses, species, sub-types, strains, or genotypes of HPV.

In some embodiments, the present invention provides systems comprising: (a) a mass spectrometer configured to detect one or more molecular masses of amplicons produced using at least one purified oligonucleotide primer pair that comprises forward and reverse primers about 15 to 35 nucleobases in length, wherein the forward primer comprises at least 70% (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) identity with a sequence selected from SEQ ID NOs: 1-8 and 17-43, and wherein the reverse primer comprises at least 70% (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) identity with a sequence selected from SEQ ID NOs: 9-16 and 44-70; and (b) a controller operably connected to the mass spectrometer, the controller configured to correlate the molecular masses of the amplicons with one or more species of HPV identities. In certain embodiments, the second primer pair is selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, 8:16, 17:44, 18:45, 19:46, 20:47, 21:48, 22:49, 23:50, 24:51, 25:52, 26:53, 27:54, 28:55, 29:56, 30:57, 31:58, 32:59, 33:60, 34:61, 35:62, 36:63, 37:64, 38:65, 39:66, 40:67, 41:68, 42:69, and 43:70. In other embodiments, the controller is configured to determine base compositions of the amplicons from the molecular masses of the amplicons, which base compositions correspond to the one or more species of HPV. In particular embodiments, the controller comprises or is operably connected to a database of known molecular masses and/or known base compositions of amplicons of known species of HPV produced with the primer pair.

In certain embodiments, the database comprises molecular mass information for at least three different bioagents. In other embodiments, the database comprises molecular mass information for at least 2 . . . 10 . . . 50 . . . 100 . . . 1000 . . . 10,000, or 100,000 different bioagents. In particular embodiments, the molecular mass information comprises base composition data. In some embodiments, the base composition data comprises at least 10 . . . 50 . . . 100 . . . 500 . . . 1000 . . . 10,000 . . . or 100,000 unique base compositions. In other embodiments, the database comprises molecular mass information for a bioagent from two or more genuses selected from the group consisting of, but not limited to, alphapapillomavirus, betapapillomavirus, gammapapillomavirus, mupapillomavirus, and nupapillomavirus. In some embodiments, the database comprises molecular mass information for a bioagent from each of the genuses alphapapillomavirus, betapapillomavirus, gammapapillomavirus, mupapillomavirus, and nupapillomavirus. In further embodiments, the database comprises molecular mass information for a HPV bioagent. In further embodiments, the database is stored on a local computer. In particular embodiments, the database is accessed from a remote computer over a network. In further embodiments, the molecular mass in the database is associated with bioagent identity. In certain embodiments, the molecular mass in the database is associated with bioagent geographic origin. In particular embodiments, bioagent identification comprises interrogation of the database with two or more different molecular masses (e.g., 2, 3, 4, 5, . . . 10 . . . 25 or more molecular masses) associated with the bioagent.

In some embodiments, the present invention provides compositions comprising at least one purified oligonucleotide primer 15 to 35 nucleobases in length, wherein the oligonucleotide primer comprises at least 70% (e.g., 70% . . . 75% . . . 90% . . . 95% . . . 100%) identity with a sequence selected from SEQ ID NOs: 1-16 and 17-70.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing summary and detailed description are better understood when read in conjunction with the accompanying drawings which are included by way of example and not by way of limitation.

FIG. 1 shows a process diagram illustrating one embodiment of the primer pair selection process.

FIG. 2 shows a process diagram illustrating one embodiment of the primer pair validation process. Here select primers are shown meeting test criteria. Criteria include, but are not limited to, the ability to amplify targeted HPV nucleic acid, the ability to exclude non-target bioagents, the ability to not produce unexpected amplicons, the ability to not dimerize, the ability to have analytical limits of detection of ≤100 genomic copies/reaction, and the ability to differentiate amongst different target organisms.

FIG. 3 shows a process diagram illustrating an embodiment of the calibration method.

FIG. 4 shows a block diagram showing a representative system.

DETAILED DESCRIPTION OF EMBODIMENTS

It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. Further, unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. In describing and claiming the present invention, the following terminology and grammatical variants will be used in accordance with the definitions set forth below.

As used herein, the term “about” means encompassing plus or minus 10%. For example, about 200 nucleotides refers to a range encompassing between 180 and 220 nucleotides.
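The arithmetic behind this definition can be sketched as follows. The function name and the default percentage parameter are illustrative assumptions introduced here; only the plus or minus 10% rule and the 200-nucleotide example come from the definition above.

```python
def about_range(value, percent=10):
    """Return the (low, high) bounds covered by 'about value'.

    Assumes 'about' means plus or minus `percent` of the stated value.
    """
    delta = value * percent / 100
    return (value - delta, value + delta)


low, high = about_range(200)
print(low, high)  # bounds for "about 200 nucleotides"
```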

As used herein, the term “amplicon” or “bioagent identifying amplicon” refers to a nucleic acid generated using the primer pairs described herein. The amplicon is typically double stranded DNA; however, it may be RNA and/or DNA:RNA. In some embodiments, the amplicon comprises DNA complementary to HPV RNA, DNA, or cDNA. In some embodiments, the amplicon comprises sequences of conserved regions/primer pairs and intervening variable region. As discussed herein, primer pairs are configured to generate amplicons from HPV nucleic acid. As such, the base composition of any given amplicon may include the primer pair, the complement of the primer pair, the conserved regions and the variable region from the bioagent that was amplified to generate the amplicon. One skilled in the art understands that the incorporation of the designed primer pair sequences into an amplicon may replace the native sequences at the primer binding site, and complement thereof. In certain embodiments, after amplification of the target region using the primers the resultant amplicons having the primer sequences are used to generate the molecular mass data. Generally, the amplicon further comprises a length that is compatible with mass spectrometry analysis. Bioagent identifying amplicons generate base compositions that are preferably unique to the identity of a bioagent (e.g., HPV).

Amplicons typically comprise from about 45 to about 200 consecutive nucleobases (i.e., from about 45 to about 200 linked nucleosides). One of ordinary skill in the art will appreciate that this range expressly embodies compounds of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, and 200 nucleobases in length. One of ordinary skill in the art will further appreciate that the above range is not an absolute limit to the length of an amplicon, but instead represents a preferred length range. Amplicon lengths falling outside of this range are also included herein so long as the amplicon is amenable to calculation of a base composition signature as herein described.

The term “amplifying” or “amplification” in the context of nucleic acids refers to the production of multiple copies of a polynucleotide, or a portion of the polynucleotide, typically starting from a small amount of the polynucleotide (e.g., a single polynucleotide molecule), where the amplification products or amplicons are generally detectable. Amplification of polynucleotides encompasses a variety of chemical and enzymatic processes. The generation of multiple DNA copies from one or a few copies of a target or template DNA molecule during a polymerase chain reaction (PCR) or a ligase chain reaction (LCR) are forms of amplification. Amplification is not limited to the strict duplication of the starting molecule. For example, the generation of multiple cDNA molecules from a limited amount of RNA in a sample using reverse transcription (RT)-PCR is a form of amplification. Furthermore, the generation of multiple RNA molecules from a single DNA molecule during the process of transcription is also a form of amplification.

As used herein, “viral nucleic acid” includes, but is not limited to, DNA, RNA, or DNA that has been obtained from viral RNA, such as, for example, by performing a reverse transcription reaction. Viral RNA can either be single-stranded (of positive or negative polarity) or double-stranded.

As used herein, the term “base composition” refers to the number of each residue comprised in an amplicon or other nucleic acid, without consideration for the linear arrangement of these residues in the strand(s) of the amplicon. The amplicon residues comprise adenosine (A), guanosine (G), cytidine (C), (deoxy)thymidine (T), uracil (U), inosine (I), nitroindoles such as 5-nitroindole or 3-nitropyrrole, dP or dK (Hill F et al. Polymerase recognition of synthetic oligodeoxyribonucleotides incorporating degenerate pyrimidine and purine bases. Proc Natl Acad Sci USA. 1998 Apr. 14; 95(8):4258-63), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056), the purine analog 1-(2-deoxy-beta-D-ribofuranosyl)-imidazole-4-carboxamide, 2,6-diaminopurine, 5-propynyluracil, 5-propynylcytosine, phenoxazines, including G-clamp, 5-propynyl deoxy-cytidine, deoxy-thymidine nucleotides, 5-propynylcytidine, 5-propynyluridine and mass tag modified versions thereof, including 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C. In some embodiments, the non-natural nucleosides used herein include 5-propynyluracil, 5-propynylcytosine and inosine. Herein the base composition for an unmodified DNA amplicon is notated as A_(w)G_(x)C_(y)T_(z), wherein w, x, y and z are each independently a whole number representing the number of said nucleoside residues in an amplicon. Base compositions for amplicons comprising modified nucleosides are similarly notated to indicate the number of said natural and modified nucleosides in an amplicon. Base compositions are calculated from a molecular mass measurement of an amplicon, as described below. The calculated base composition for any given amplicon is then compared to a database of base compositions. A match between the calculated base composition and a single database entry reveals the identity of the bioagent.
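To make the notation concrete, the sketch below counts residues in an unmodified DNA strand and prints them in the A, G, C, T order used above. It derives the composition from a known sequence purely for illustration (in the methods described herein the composition is calculated from a measured molecular mass instead), and the two example sequences are invented placeholders rather than HPV sequences.

```python
from collections import Counter


def base_composition(sequence):
    """Count A, G, C and T residues, ignoring their linear order."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "AGCT"}


def notate(composition):
    """Render a composition in the A(w) G(x) C(y) T(z) order."""
    return " ".join(f"{base}{composition[base]}" for base in "AGCT")


# Two placeholder strands with different sequences but identical residue counts.
amplicon_1 = "AGGTCCATAG"
amplicon_2 = "GATACCTGGA"

print(notate(base_composition(amplicon_1)))
print(base_composition(amplicon_1) == base_composition(amplicon_2))
```

Because the two strands differ only in the arrangement of their residues, they share one base composition, which is the sense in which a base composition disregards linear sequence.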

As used herein, a “base composition probability cloud” is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species, family or genus. Base composition calculations for a plurality of amplicons are mapped on a pseudo four-dimensional plot. Related members in a family, genus or species typically cluster within this plot, forming a base composition probability cloud.

As used herein, the term “base composition signature” refers to the base composition generated by any one particular amplicon.

As used herein, a “bioagent” means any biological organism or component thereof or a sample containing a biological organism or component thereof, including microorganisms or infectious substances, or any naturally occurring, bioengineered or synthesized component of any such microorganism or infectious substance or any nucleic acid derived from any such microorganism or infectious substance. Those of ordinary skill in the art will understand fully what is meant by the term bioagent given the instant disclosure. Still, a non-exhaustive list of bioagents includes: cells, cell lines, human clinical samples, mammalian blood samples, cell cultures, bacterial cells, viruses, viroids, fungi, protists, parasites, rickettsiae, protozoa, animals, mammals or humans. Samples may be alive, non-replicating or dead or in a vegetative state (for example, vegetative bacteria or spores). Preferably, the bioagent is an HPV such as Alphapapillomavirus or Gammapapillomavirus.

As used herein, a “bioagent division” is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, genus, classes, clades, genera or other such groupings of bioagents above the species level.

As used herein, “broad range survey primers” are primers designed to identify an unknown bioagent as a member of a particular biological division (e.g., an order, family, class, clade, or genus). However, in some cases the broad range survey primers are also able to identify unknown bioagents at the species or sub-species level. As used herein, “division-wide primers” are primers designed to identify a bioagent at the species level and “drill-down” primers are primers designed to identify a bioagent at the sub-species level. As used herein, the “sub-species” level of identification includes, but is not limited to, strains, subtypes, variants, and isolates. Drill-down primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective.

As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′,” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids.
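The base-pairing rules can be illustrated with a short sketch. It assumes unmodified DNA bases and two strands of equal length, both written 5′ to 3′; the function names are illustrative and the scoring (fraction of paired positions) is a simplification introduced here, not a measure defined by this disclosure.

```python
# Watson-Crick pairing for unmodified DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(strand, other):
    """Fraction of positions that pair when `other` is aligned antiparallel.

    Both strands are given 5' to 3' and must have the same length.
    """
    if len(strand) != len(other):
        raise ValueError("strands must have equal length")
    partner = reverse_complement(other)
    matches = sum(a == b for a, b in zip(strand.upper(), partner))
    return matches / len(strand)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads 5'-ACT-3'.
print(reverse_complement("AGT"))
print(complementarity("AGT", "ACT"))  # complete complementarity
print(complementarity("AGT", "ACC"))  # partial complementarity
```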

The term “conserved region” in the context of nucleic acids refers to a nucleobase sequence (e.g., a subsequence of a nucleic acid, etc.) that is the same or similar in two or more different regions or segments of a given nucleic acid molecule (e.g., an intramolecular conserved region), or that is the same or similar in two or more different nucleic acid molecules (e.g., an intermolecular conserved region). To illustrate, a conserved region may be present in two or more different taxonomic ranks (e.g., two or more different genera, two or more different species, two or more different subspecies, and the like) or in two or more different nucleic acid molecules from the same organism. To further illustrate, in certain embodiments, nucleic acids comprising at least one conserved region typically have between about 70%-100%, between about 80-100%, between about 90-100%, between about 95-100%, or between about 99-100% sequence identity in that conserved region. A conserved region may also be selected or identified functionally as a region that permits generation of amplicons via primer extension through hybridization of a completely or partially complementary primer to the conserved region for each of the target sequences to which the conserved region is conserved.

The term “correlates” refers to establishing a relationship between two or more things. In certain embodiments, for example, detected molecular masses of one or more amplicons indicate the presence or identity of a given bioagent in a sample. In some embodiments, base compositions are calculated or otherwise determined from the detected molecular masses of amplicons, which base compositions indicate the presence or identity of a given bioagent in a sample.

As used herein, in some embodiments the term “database” is used to refer to a collection of base composition molecular mass data. In other embodiments the term “database” is used to refer to a collection of base composition data. The base composition data in the database is indexed to bioagents and to primer pairs. The base composition data reported in the database comprises the number of each nucleoside in an amplicon that would be generated for each bioagent using each primer. The database can be populated by empirical data. In this aspect of populating the database, a bioagent is selected and a primer pair is used to generate an amplicon. The amplicon's molecular mass is determined using a mass spectrometer and the base composition calculated therefrom without sequencing, i.e., without determining the linear sequence of nucleobases comprising the amplicon. Note that base composition entries in the database may be derived from sequencing data (i.e., known sequence information), but the base composition of the amplicon to be identified is determined without sequencing the amplicon. An entry in the database is made to associate the base composition with the bioagent and the primer pair used. The database may also be populated using other databases comprising bioagent information. For example, using the GenBank database it is possible to perform electronic PCR using an electronic representation of a primer pair. This in silico method may provide the base composition for any or all selected bioagent(s) stored in the GenBank database. The information may then be used to populate the base composition database as described above. A base composition database can be in silico, a written table, a reference book, a spreadsheet or any form generally amenable to databases. Preferably, it is in silico on computer readable media.
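The in silico route to populating such a database can be sketched in a few lines. This is a simplified illustration under stated assumptions: primers must match the template exactly, only one strand is searched, and the primer sequences, reference sequences, and names (`FORWARD`, `REVERSE`, `pair_x`, `reference_1`, `reference_2`) are invented placeholders, not GenBank records or sequences from this disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def in_silico_amplicon(template, forward, reverse):
    """Return the predicted amplicon (primers included), or None.

    Assumes exact primer matches on the given strand of the template.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def composition(seq):
    """Base composition as an (A, G, C, T) tuple."""
    return tuple(seq.count(base) for base in "AGCT")


# Placeholder primer pair and reference sequences (hypothetical).
FORWARD, REVERSE = "ACGTAC", "TTGCAG"
references = {
    "reference_1": "GG" + FORWARD + "AATT" + reverse_complement(REVERSE) + "CC",
    "reference_2": "GG" + FORWARD + "AGTT" + reverse_complement(REVERSE) + "CC",
}

# Index each entry by bioagent and by the primer pair used.
database = {}
for name, sequence in references.items():
    amplicon = in_silico_amplicon(sequence, FORWARD, REVERSE)
    if amplicon is not None:
        database[(name, "pair_x")] = composition(amplicon)


def lookup(observed, pair_name):
    """List bioagents whose stored composition matches the observed one."""
    return sorted(name for (name, pair), comp in database.items()
                  if pair == pair_name and comp == observed)


print(database)
print(lookup((5, 3, 4, 4), "pair_x"))
```

Keying each entry on both the bioagent and the primer pair reflects the indexing described above; a production system would also need to tolerate primer mismatches and search both strands.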

The term “detect”, “detecting” or “detection” refers to an act of determining the existence or presence of one or more targets (e.g., bioagent nucleic acids, amplicons, etc.) in a sample.

As used herein, the term “etiology” refers to the causes or origins of diseases or abnormal physiological conditions.

As used herein, the term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, precursor, or RNA (e.g., rRNA, tRNA). The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, immunogenicity, etc.) of the full-length sequence or fragment thereof are retained.

As used herein, the term “heterologous gene” refers to a gene that is not in its natural environment. For example, a heterologous gene includes a gene from one species introduced into another species. A heterologous gene also includes a gene native to an organism that has been altered in some way (e.g., mutated, added in multiple copies, linked to non-native regulatory sequences, etc.). Heterologous genes are distinguished from endogenous genes in that the heterologous gene sequences are typically joined to nucleic acid sequences that are not found naturally associated with the gene sequences in the chromosome or are associated with portions of the chromosome not found in nature (e.g., genes expressed in loci where the gene is not normally expressed).

The terms “homology,” “homologous” and “sequence identity” refer to a degree of identity. There may be partial homology or complete homology. A partially homologous sequence is one that is less than 100% identical to another sequence. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but having two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer. In context of the present invention, sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5′ to 3′ direction. Sequence alignment algorithms such as BLAST will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5′ to 3′ direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5′ to 3′ direction while the subject sequence is in the 3′ to 5′ direction. It should be understood that with respect to the primers of the present invention, sequence identity is properly determined when the alignment is designated as Plus/Plus. Sequence identity may also encompass alternate or “modified” nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions. In a non-limiting example, if the 5-propynyl pyrimidines propyne C and/or propyne T replace one or more C or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. In another non-limiting example, inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil). Thus, if inosine replaces one or more G or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
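The two worked examples in this definition (18/20 = 90% and 15/20 = 75%) can be reproduced with a short Python sketch. It assumes an ungapped comparison with both sequences written 5′ to 3′ (the Plus/Plus orientation), divides the best match count by the length of the longer primer, and ignores modified bases. The function name and the primer sequences are hypothetical; this is not the BLAST algorithm.

```python
def percent_identity(query, subject):
    """Ungapped percent identity, both sequences given 5' to 3' (Plus/Plus).

    The shorter sequence is slid along the longer one; the best match count
    is divided by the length of the longer sequence, as in the 15/20 example.
    """
    shorter, longer = sorted((query.upper(), subject.upper()), key=len)
    best = 0
    for offset in range(len(longer) - len(shorter) + 1):
        window = longer[offset:offset + len(shorter)]
        matches = sum(a == b for a, b in zip(shorter, window))
        best = max(best, matches)
    return 100.0 * best / len(longer)


# Hypothetical 20-nucleobase primers differing at two positions.
primer_a = "ACGTACGTACGTACGTACGT"
primer_b = "ACGTACGTACCTACGTACGA"
print(percent_identity(primer_a, primer_b))        # 18 of 20 residues match

# A 15-nucleobase primer identical to a segment of the 20-nucleobase primer.
print(percent_identity(primer_a[3:18], primer_a))  # 15 of 20 residues match
```

A fuller treatment would also count modified or universal bases such as inosine as identical to the residues they replace, which this sketch does not attempt.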

As used herein, “housekeeping gene” or “core viral gene” refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent. Housekeeping genes include, but are not limited to, genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like.

As used herein, the term “hybridization” or “hybridize” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the melting temperature (T_(m)) of the formed hybrid, and the G:C ratio within the nucleic acids. A single molecule that contains pairing of complementary nucleic acids within its structure is said to be “self-hybridized.” An extensive guide to nucleic acid hybridization may be found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier (1993), which is incorporated by reference.

As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (e.g., in the presence of nucleotides and an inducing agent such as a biocatalyst (e.g., a DNA polymerase or the like) and at a suitable temperature and pH). The primer is typically single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is generally first treated to separate its strands before being used to prepare extension products. In some embodiments, the primer is an oligodeoxyribonucleotide. The primer is sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.

As used herein, “intelligent primers” or “primers” or “primer pairs,” in some embodiments, are oligonucleotides that are designed to bind to conserved sequence regions of one or more bioagent nucleic acids to generate bioagent identifying amplicons. In some embodiments, the bound primers flank an intervening variable region between the conserved binding sequences. Upon amplification, the primer pairs yield amplicons (e.g., amplification products) that provide base composition variability between the two or more bioagents. The variability of the base compositions allows for the identification of one or more individual bioagents from, e.g., two or more bioagents based on the base composition distinctions. In some embodiments, the primer pairs are also configured to generate amplicons amenable to molecular mass analysis. Further, the sequences of the primer members of the primer pairs are not necessarily fully complementary to the conserved region of the reference bioagent. For example, in some embodiments, the sequences are designed to be “best fit” amongst a plurality of bioagents at these conserved binding sequences. Therefore, the primer members of the primer pairs have substantial complementarity with the conserved regions of the bioagents, including the reference bioagent.

In some embodiments of the invention, the oligonucleotide primer pairs described herein can be purified. As used herein, “purified oligonucleotide primer pair,” “purified primer pair,” or “purified” means an oligonucleotide primer pair that is chemically synthesized to have a specific sequence and a specific number of linked nucleosides. This term is meant to explicitly exclude nucleotides that are generated at random to yield a mixture of several compounds of the same length each with randomly generated sequence. As used herein, the term “purified” or “to purify” refers to the removal of one or more components (e.g., contaminants) from a sample.

As used herein, the term “molecular mass” refers to the mass of a compound as determined using mass spectrometry, for example, ESI-MS. Herein, the compound is preferably a nucleic acid. In some embodiments, the nucleic acid is a double stranded nucleic acid (e.g., a double stranded DNA nucleic acid). In some embodiments, the nucleic acid is an amplicon. When the nucleic acid is double stranded the molecular mass is determined for both strands. In one embodiment, the strands may be separated before introduction into the mass spectrometer, or the strands may be separated by the mass spectrometer (for example, electrospray ionization will separate the hybridized strands). The molecular mass of each strand is measured by the mass spectrometer.

As used herein, the term “nucleic acid molecule” refers to any nucleic acid containing molecule, including but not limited to, DNA or RNA. The term encompasses sequences that include any of the known base analogs of DNA and RNA including, but not limited to, 4-acetylcytosine, 8-hydroxy-N-6-methyladenosine, aziridinylcytosine, pseudoisocytosine, 5-(carboxyhydroxyl-methyl) uracil, 5-fluorouracil, 5-bromouracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethyl-aminomethyluracil, dihydrouracil, inosine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudo-uracil, 1-methylguanine, 1-methylinosine, 2,2-dimethyl-guanine, 2-methyladenine, 2-methylguanine, 3-methyl-cytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxy-amino-methyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarbonylmethyluracil, 5-methoxyuracil, 2-methylthio-N-isopentenyladenine, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, oxybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, N-uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, pseudouracil, queosine, 2-thiocytosine, and 2,6-diaminopurine.

As used herein, the term “nucleobase” is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).” As used herein, a nucleobase includes natural and modified residues, as described herein.

An “oligonucleotide” refers to a nucleic acid that includes at least two nucleic acid monomer units (e.g., nucleotides), typically more than three monomer units, and more typically greater than ten monomer units. The exact size of an oligonucleotide generally depends on various factors, including the ultimate function or use of the oligonucleotide. To further illustrate, oligonucleotides are typically less than 200 residues long (e.g., between 15 and 100); however, as used herein, the term is also intended to encompass longer polynucleotide chains. Oligonucleotides are often referred to by their length. For example, a 24 residue oligonucleotide is referred to as a “24-mer”. Typically, the nucleoside monomers are linked by phosphodiester bonds or analogs thereof, including phosphorothioate, phosphorodithioate, phosphoroselenoate, phosphorodiselenoate, phosphoroanilothioate, phosphoranilidate, phosphoramidate, and the like, including associated counterions, e.g., H⁺, NH₄⁺, Na⁺, and the like, if such counterions are present. Further, oligonucleotides are typically single-stranded. Oligonucleotides are optionally prepared by any suitable method, including, but not limited to, isolation of an existing or natural sequence, DNA replication or amplification, reverse transcription, cloning and restriction digestion of appropriate sequences, or direct chemical synthesis by a method such as the phosphotriester method of Narang et al. (1979) Meth Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetrahedron Lett. 22: 1859-1862; the triester method of Matteucci et al. (1981) J Am Chem. Soc. 103:3185-3191; automated synthesis methods; or the solid support method of U.S. Pat. No. 4,458,066, entitled “PROCESS FOR PREPARING POLYNUCLEOTIDES,” issued Jul. 3, 1984 to Caruthers et al., or other methods known to those skilled in the art. All of these references are incorporated by reference.

As used herein, a “sample” refers to anything capable of being analyzed by the methods provided herein. In some embodiments, the sample comprises or is suspected to comprise one or more nucleic acids capable of analysis by the methods. Preferably, the samples comprise nucleic acids (e.g., DNA, RNA, cDNAs, etc.) from one or more HPV. Samples can include, for example, blood, saliva, urine, feces, anorectal swabs, vaginal swabs, cervical swabs, and the like. In some embodiments, the samples are “mixture” samples, which comprise nucleic acids from more than one subject or individual. In some embodiments, the methods provided herein comprise purifying the sample or purifying the nucleic acid(s) from the sample. In some embodiments, the sample is purified nucleic acid.

A “sequence” of a biopolymer refers to the order and identity of monomer units (e.g., nucleotides, etc.) in the biopolymer. The sequence (e.g., base sequence) of a nucleic acid is typically read in the 5′ to 3′ direction.

As used herein, the term “single primer pair identification” means that one or more bioagents can be identified using a single primer pair. A base composition signature for an amplicon may singly identify one or more bioagents.

As used herein, a “sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain may be distinguished from another viral strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase.

As used herein, in some embodiments the term “substantial complementarity” means that a primer member of a primer pair comprises between about 70%-100%, or between about 80-100%, or between about 90-100%, or between about 95-100%, or between about 99-100% complementarity with the conserved binding sequence of a nucleic acid from a given bioagent. Similarly, the primer pairs provided herein may comprise between about 70%-100%, or between about 80-100%, or between about 90-100%, or between about 95-100%, or between about 99-100% sequence identity with the primer pairs disclosed in Tables 1 and 2. These ranges of complementarity and identity are inclusive of all whole or partial numbers embraced within the recited range numbers. For example, and not limitation, 75.667%, 82%, 91.2435% and 97% complementarity or sequence identity are all numbers that fall within the above recited range of 70% to 100%, therefore forming a part of this description. In some embodiments, any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Tables 1 and 2 if the primer pair has the capability of producing an amplification product corresponding to the desired HPV identifying amplicon.

A “system” in the context of analytical instrumentation refers to a group of objects and/or devices that form a network for performing a desired objective.

As used herein, “triangulation identification” means the use of more than one primer pair to generate a corresponding amplicon for identification of a bioagent. The more than one primer pair can be used in individual wells or vessels or in a multiplex PCR assay. Alternatively, PCR reactions may be carried out in single wells or vessels comprising a different primer pair in each well or vessel. Following amplification the amplicons are pooled into a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals. Triangulation is a process of elimination, wherein a first primer pair identifies that an unknown bioagent may be one of a group of bioagents. Subsequent primer pairs are used in triangulation identification to further refine the identity of the bioagent amongst the subset of possibilities generated with the earlier primer pair. Triangulation identification is complete when the identity of the bioagent is determined. The triangulation identification process may also be used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J Appl Microbiol., 1999, 87, 270-278) in the absence of the expected compositions from the B. anthracis genome would suggest a genetic engineering event.
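The process of elimination described in this definition can be sketched in Python as a running intersection of candidate sets, one set per primer pair. This is a minimal illustration under the assumption that each primer pair has already been resolved to a set of candidate bioagents; the function name and the bioagent labels are hypothetical.

```python
def triangulate(candidate_sets):
    """Narrow the identity of an unknown by intersecting candidate sets.

    candidate_sets: one set of candidate bioagents per primer pair, in the
    order the primer pairs are applied. Stops once at most one candidate
    remains.
    """
    remaining = None
    for candidates in candidate_sets:
        if remaining is None:
            remaining = set(candidates)
        else:
            remaining &= set(candidates)
        if len(remaining) <= 1:
            break
    return remaining if remaining is not None else set()


# Hypothetical candidates returned by three successive primer pairs.
result = triangulate([
    {"type_1", "type_2", "type_3"},  # first primer pair: a group of bioagents
    {"type_1", "type_3"},            # second primer pair refines the subset
    {"type_3", "type_4"},            # third primer pair completes the call
])
print(result)
```

An empty result in this sketch would indicate that no single database entry is consistent with every amplicon, which is the kind of inconsistency the definition associates with false signals or engineered bioagents.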

As used herein, the term “unknown bioagent” can mean, for example: (i) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003) and/or (ii) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed. For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. patent application Ser. No. 10/829,826 (incorporated herein by reference in its entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of “unknown” bioagent are applicable since the SARS coronavirus was unknown to science prior to April 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. patent application Ser. No. 10/829,826 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the second meaning (ii) of “unknown” bioagent would apply, because the SARS coronavirus became known to science subsequent to April 2003 but it was not known what bioagent was present in the sample.

As used herein, the term “variable region” is used to describe a region that falls between the two primers of any one primer pair described herein. The region possesses distinct base compositions between at least two bioagents, such that at least one bioagent can be identified at, for example, the family, genus, species or sub-species level. The degree of variability between the at least two bioagents need only be sufficient to allow for identification using mass spectrometry analysis, as described herein.

As used herein, a “wobble base” is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.

Provided herein are methods, compositions, kits, and related systems for the detection and identification of bioagents (e.g., species of HPV) using bioagent identifying amplicons. In some embodiments, primers are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent and which flank variable sequence regions to yield a bioagent identifying amplicon which can be amplified and which is amenable to molecular mass determination. In some embodiments, the molecular mass is converted to a base composition, which indicates the number of each nucleotide in the amplicon. Systems employing software and hardware useful in converting molecular mass data into base composition information are available from, for example, Ibis Biosciences, Inc. (Carlsbad, Calif.), for example the Ibis T5000 Biosensor System, and are described in U.S. patent application Ser. No. 10/754,415, filed Jan. 9, 2004, incorporated by reference herein in its entirety. In some embodiments, the molecular mass or corresponding base composition of one or more different amplicons is queried against a database of molecular masses or base compositions indexed to bioagents and to the primer pair used to generate the amplicon. A match of the measured base composition to a database entry base composition associates the sample bioagent to an indexed bioagent in the database. Thus, the identity of the unknown bioagent is determined. No prior knowledge of the unknown bioagent is necessary to make an identification. In some instances, the measured base composition associates with more than one database entry base composition. Thus, a second/subsequent primer pair is generally used to generate an amplicon, and its measured base composition is similarly compared to the database to determine its identity in triangulation identification. Furthermore, the methods and other aspects of the invention can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy. Thus, in some embodiments, the present invention provides rapid throughput and does not require nucleic acid sequencing or knowledge of the linear sequences of nucleobases of the amplified target sequence for bioagent detection and identification.
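The database query described above, including the case where one measured base composition associates with more than one entry, can be illustrated with the following Python sketch. The database contents, labels and function name are hypothetical, and an exact match on the (A, G, C, T) counts is assumed for simplicity.

```python
# Hypothetical database: (bioagent, primer pair) -> base composition (A, G, C, T).
DATABASE = {
    ("type_1", "pair_A"): (3, 3, 2, 3),
    ("type_2", "pair_A"): (3, 3, 2, 3),
    ("type_3", "pair_A"): (4, 2, 1, 4),
    ("type_1", "pair_B"): (5, 1, 2, 2),
    ("type_2", "pair_B"): (4, 2, 2, 2),
}


def match_bioagents(database, primer_pair, measured):
    """Return every indexed bioagent whose entry for this primer pair
    equals the measured base composition."""
    return sorted(
        bioagent
        for (bioagent, pair), composition in database.items()
        if pair == primer_pair and composition == measured
    )


# The first primer pair is ambiguous: two entries share one composition.
ambiguous = match_bioagents(DATABASE, "pair_A", (3, 3, 2, 3))
# A second primer pair distinguishes the two remaining candidates.
resolved = match_bioagents(DATABASE, "pair_B", (4, 2, 2, 2))
print(ambiguous, resolved)
```

In this toy data the first lookup returns two candidates and the second returns one, which is the situation in which a second or subsequent primer pair is used.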

Particular embodiments of the mass-spectrum based detection methods are described in the following patents, patent applications and scientific publications, all of which are herein incorporated by reference as if fully set forth herein: U.S. Pat. Nos. 7,108,974; 7,217,510; 7,226,739; 7,255,992; 7,312,036; 7,339,051; US patent publication numbers 2003/0027135; 2003/0167133; 2003/0167134; 2003/0175695; 2003/0175696; 2003/0175697; 2003/0187588; 2003/0187593; 2003/0190605; 2003/0225529; 2003/0228571; 2004/0110169; 2004/0117129; 2004/0121309; 2004/0121310; 2004/0121311; 2004/0121312; 2004/0121313; 2004/0121314; 2004/0121315; 2004/0121329; 2004/0121335; 2004/0121340; 2004/0122598; 2004/0122857; 2004/0161770; 2004/0185438; 2004/0202997; 2004/0209260; 2004/0219517; 2004/0253583; 2004/0253619; 2005/0027459; 2005/0123952; 2005/0130196; 2005/0142581; 2005/0164215; 2005/0266397; 2005/0270191; 2006/0014154; 2006/0121520; 2006/0205040; 2006/0240412; 2006/0259249; 2006/0275749; 2006/0275788; 2007/0087336; 2007/0087337; 2007/0087338; 2007/0087339; 2007/0087340; 2007/0087341; 2007/0184434; 2007/0218467; 2007/0218489; 2007/0224614; 2007/0238116; 2007/0243544; 2007/0248969; WO2002/070664; WO2003/001976; WO2003/100035; WO2004/009849; WO2004/052175; WO2004/053076; WO2004/053141; WO2004/053164; WO2004/060278; WO2004/093644; WO2004/101809; WO2004/111187; WO2005/023083; WO2005/023986; WO2005/024046; WO2005/033271; WO2005/036369; WO2005/086634; WO2005/089128; WO2005/091971; WO2005/092059; WO2005/094421; WO2005/098047; WO2005/116263; WO2005/117270; WO2006/019784; WO2006/034294; WO2006/071241; WO2006/094238; WO2006/116127; WO2006/135400; WO2007/014045; WO2007/047778; WO2007/086904; WO2007/100397; WO2007/118222; Ecker et al., Ibis T5000: a universal biosensor approach for microbiology. Nat Rev Microbiol. 2008 Jun. 3; Ecker et al., The Microbial Rosetta Stone Database: A compilation of global and emerging infectious microorganisms and bioterrorist threat agents. BMC Microbiology. 2005. 5(1): 19; Ecker et al., The Ibis T5000 Universal Biosensor: An Automated Platform for Pathogen Identification and Strain Typing. JALA. 2006. 6(11): 341-351; Ecker et al., The Microbial Rosetta Stone Database: A common structure for microbial biosecurity threat agents. J Forensic Sci. 2005. 50(6): 1380-5; Ecker et al., Identification of Acinetobacter species and genotyping of Acinetobacter baumannii by multilocus PCR and mass spectrometry. J Clin Microbiol. 2006 August; 44(8):2921-32; Ecker et al., Rapid identification and strain-typing of respiratory pathogens for epidemic surveillance. Proc Natl Acad Sci USA. 2005 May 31; 102(22):8012-7. Epub 2005 May 23; Wortmann et al., Genotypic evolution of Acinetobacter baumannii Strains in an outbreak associated with war trauma, Infect Control Hosp Epidemiol. 2008 June; 29(6):553-555; Hannis et al., High-resolution genotyping of Campylobacter species by use of PCR and high-throughput mass spectrometry. J Clin Microbiol. 2008 April; 46(4): 1220-5; Blyn et al., Rapid detection and molecular serotyping of adenovirus by use of PCR followed by electrospray ionization mass spectrometry. J Clin Microbiol. 2008 February; 46(2):644-51; Eshoo et al., Direct broad-range detection of alphaviruses in mosquito extracts, Virology. Nov. 25; 368(2):286-95; Sampath et al., Global surveillance of emerging Influenza virus genotypes by mass spectrometry. PLoS ONE. 2007 May 30; 2(5):e489; Sampath et al., Rapid identification of emerging infectious agents using PCR and electrospray ionization mass spectrometry. Ann N Y Acad. Sci. 2007 April; 1102: 109-20; Hujer et al., Analysis of antibiotic resistance genes in multidrug-resistant Acinetobacter sp. isolates from military and civilian patients treated at the Walter Reed Army Medical Center. Antimicrob Agents Chemother. 2006 December; 50(12):4114-23; Hall et al., Base composition analysis of human mitochondrial DNA using electrospray ionization mass spectrometry: a novel tool for the identification and differentiation of humans. Anal Biochem. 2005 Sep. 1; 344(1):53-69; Sampath et al., Rapid identification of emerging pathogens: coronavirus. Emerg Infect Dis. 2005 March; 11(3):373-9; Jiang Y, Hofstadler S A. A highly efficient and automated method of purifying and desalting PCR products for analysis by electrospray ionization mass spectrometry. Anal Biochem. 2003. 316: 50-57; Jiang et al., Mitochondrial DNA mutation detection by electrospray mass spectrometry. Clin Chem. 2006. 53(2): 195-203. Epub December 7; Russell et al., Transmission dynamics and prospective environmental sampling of adenovirus in a military recruit setting. J Infect Dis. 2006. 194(7): 877-85. Epub 2006 Aug. 25; Hofstadler et al., Detection of microbial agents using broad-range PCR with detection by mass spectrometry: The TIGER concept. Chapter in Encyclopedia of Rapid Microbiological Methods. 2006; Hofstadler et al., Selective ion filtering by digital thresholding: A method to unwind complex ESI-mass spectra and eliminate signals from low molecular weight chemical noise. Anal Chem. 2006. 78(2): 372-378; Hofstadler et al., TIGER: The Universal Biosensor. Int J Mass Spectrom. 2005. 242(1): 23-41; Van Ert et al., Mass spectrometry provides accurate characterization of two genetic marker types in Bacillus anthracis. Biotechniques. 2004. 37(4): 642-4, 646, 648; Sampath et al., Forum on Microbial Threats: Learning from SARS: Preparing for the Next Disease Outbreak—Workshop Summary (ed. Knobler S E, Mahmoud A, Lemon S.) The National Academies Press, Washington, D.C. 2004. 181-185.

In certain embodiments, bioagent identifying amplicons amenable to molecular mass determination produced by the primers described herein are either of a length, size or mass compatible with a particular mode of molecular mass determination, or compatible with a means of providing a fragmentation pattern in order to obtain fragments of a length compatible with a particular mode of molecular mass determination. Such means of providing a fragmentation pattern of an amplicon include, but are not limited to, cleavage with restriction enzymes or cleavage primers, sonication or other means of fragmentation. Thus, in some embodiments, bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.

In some embodiments, amplicons corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR). Other amplification methods may be used, such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA) (Michael, S F., Biotechniques (1994), 16:411-412 and Dean et al., Proc. Natl. Acad. Sci. U.S.A. (2002), 99, 5261-5266).

One embodiment of a process flow for primer selection and validation is depicted in FIGS. 1 and 2. For each group of organisms, candidate target sequences are identified (200) from which nucleotide sequence alignments are created (210) and analyzed (220). Primers are then configured by selecting priming regions (230) to facilitate the selection of candidate primer pairs (240). The primer pair sequence is typically a “best fit” amongst the aligned sequences, such that the primer pair sequence may or may not be fully complementary to the hybridization region on any one of the bioagents in the alignment. Thus, best fit primer pair sequences are those with sufficient complementarity with two or more bioagents to hybridize with the two or more bioagents and generate an amplicon. The primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and tested for specificity in silico (320). Bioagent identifying amplicons obtained from ePCR of GenBank sequences (310) may also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents. Preferably, the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences are directly entered into the base composition database (330). Candidate primer pairs (240) are validated by in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplicons thus obtained are analyzed to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplicons (420).
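The electronic PCR step (300) can be illustrated with a simplified Python sketch. It assumes exact primer matches on a single template strand, whereas the “best fit” primers described above may tolerate mismatches; the function names, template and primer sequences are hypothetical and this is not the ePCR software referred to in the text.

```python
def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def electronic_pcr(template, forward, reverse):
    """Return the predicted amplicon (primers included), or None.

    template: sense strand, 5' to 3'.
    forward:  forward primer, 5' to 3', matching the template strand.
    reverse:  reverse primer, 5' to 3', matching the opposite strand.
    Exact matching only; a simplifying assumption for illustration.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    # On the template strand the reverse primer appears as its reverse complement.
    reverse_site = reverse_complement(reverse.upper())
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primer pair.
template = "TTGACCATGGCTAGGTTCAAGCTTGG"
amplicon = electronic_pcr(template, forward="GACCATG", reverse="AGCTTGA")
print(amplicon)
```

The amplicon returned by such a step could then be reduced to its base composition and entered into the base composition database, as described for steps (325) and (330).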

Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, Calif.). Any other means for such synthesis known in the art may additionally or alternatively be employed.

The primers typically are employed as compositions for use in methods for identification of bioagents as follows: a primer pair composition is contacted with nucleic acid of an unknown isolate suspected of comprising HPV. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR, to obtain an amplicon that represents a bioagent identifying amplicon. The molecular mass of the strands of the double-stranded amplicon is determined by a molecular mass measurement technique such as mass spectrometry. Preferably the two strands of the double-stranded amplicon are separated during the ionization process; however, they may be separated prior to mass spectrometry measurement. In some embodiments, the mass spectrometer is an electrospray Fourier transform ion cyclotron resonance mass spectrometer (ESI-FTICR-MS) or an electrospray time of flight mass spectrometer (ESI-TOF-MS). A list of possible base compositions may be generated for the molecular mass value obtained for each strand, and the choice of the base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. A measured molecular mass or base composition calculated therefrom is then compared with a database of molecular masses or base compositions indexed to primer pairs and to known bioagents. A match between the measured molecular mass or base composition of the amplicon and the database molecular mass or base composition for that indexed primer pair correlates the measured molecular mass or base composition with an indexed bioagent, thus identifying the unknown bioagent (e.g., the species of HPV). In some embodiments, the primer pair used is at least one of the primer pairs of Tables 1 and 2. In some embodiments, the method is repeated using a different primer pair to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment (triangulation identification). In some embodiments, for example, where the unknown is a novel, previously uncharacterized organism, the molecular mass or base composition from an amplicon generated from the unknown is matched with one or more best match molecular masses or base compositions from a database to predict a family, genus, species, sub-type, etc. of the unknown. Such information may assist further characterization of the unknown or provide a physician treating a patient infected by the unknown with a therapeutic agent best calculated to treat the patient.
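The strand-matching step described above may be illustrated by the following minimal sketch. It is not the implementation used by any particular instrument. The residue masses are approximate average values, and the end-group constant, the mass tolerance, the function names and the database contents are all assumptions made for illustration only.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
# Assumed constant for the strand termini; the real value depends on end chemistry.
END_GROUP_MASS = 18.02
BASES = ("A", "G", "C", "T")


def strand_mass(comp):
    """Mass of a single strand whose composition is comp = (A, G, C, T) counts."""
    return sum(RESIDUE_MASS[b] * n for b, n in zip(BASES, comp)) + END_GROUP_MASS


def candidate_compositions(mass, tol=0.5):
    """List every (A, G, C, T) composition whose mass is within tol Da of mass."""
    target = mass - END_GROUP_MASS
    m_a, m_g, m_c, m_t = (RESIDUE_MASS[b] for b in BASES)
    found = []
    for a in range(int((target + tol) // m_a) + 1):
        rem_a = target - a * m_a
        for g in range(int((rem_a + tol) // m_g) + 1):
            rem_g = rem_a - g * m_g
            for c in range(int((rem_g + tol) // m_c) + 1):
                rem_c = rem_g - c * m_c
                t = round(rem_c / m_t)  # nearest whole number of T residues
                if t >= 0 and abs(rem_c - t * m_t) <= tol:
                    found.append((a, g, c, t))
    return found


def complement(comp):
    """Composition of the complementary strand (A<->T, G<->C)."""
    a, g, c, t = comp
    return (t, c, g, a)


def paired_compositions(fwd_mass, rev_mass, tol=0.5):
    """Keep forward-strand candidates whose complement fits the reverse-strand mass."""
    rev = set(candidate_compositions(rev_mass, tol))
    return [f for f in candidate_compositions(fwd_mass, tol) if complement(f) in rev]


def identify(primer_pair, comps, database):
    """Look up candidates in a database keyed by (primer pair number, composition)."""
    return sorted({database[(primer_pair, c)] for c in comps if (primer_pair, c) in database})
```

Requiring the two strands to be complementary generally shortens the candidate list, and the surviving composition or compositions are then looked up against entries indexed to the primer pair that produced the amplicon.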

In certain embodiments, HPV is detected with the systems and methods ofthe present invention in combination with other bioagents, includingviruses, bacteria, fungi, or other bioagents. In particular embodiments,a panel is employed that includes HPV and other related or un-relatedbioagents. Such panels may be specific for a particular type ofbioagent, or specific for a specific type of test (e.g., for testing thesafety of blood, one may include commonly present viral pathogens suchas HCV, HIV, and bacteria that can be contracted via a bloodtransfusion).

In some embodiments, a bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR).

In some embodiments, the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid. The broad range primer may identify the unknown bioagent depending on which bioagent is in the sample. In other cases, the molecular mass or base composition of an amplicon does not provide sufficient resolution to identify the unknown bioagent as any one bioagent at or below the species level. These cases generally benefit from further analysis of one or more amplicons generated from at least one additional broad range survey primer pair, or from at least one additional division-wide primer pair, or from at least one additional drill-down primer pair. Identification of sub-species characteristics may be required, for example, to determine a clinical treatment of a patient, or in rapidly responding to an outbreak of a new species, sub-type, etc. of pathogen to prevent an epidemic or pandemic.

One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Primer pair sequences may be a “best fit” amongst the aligned bioagent sequences, thus they need not be fully complementary to the hybridization region of any one of the bioagents in the alignment. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). The primers may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Tables 1 and 2. Thus, in some embodiments, an extent of variation of 70% to 100%, or any range falling within, of the sequence identity is possible relative to the specific primer sequences disclosed herein. To illustrate, determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but has two non-identical residues has 18 of 20 identical residues (18/20=0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20=0.75 or 75% sequence identity with the 20 nucleobase primer. Percent identity need not be a whole number, for example when a 28 consecutive nucleobase primer is completely identical to a 28 nucleobase segment of a 31 consecutive nucleobase primer (28/31=0.9032 or 90.3% identical).
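The arithmetic of the three examples above can be reproduced with the short sketch below. The function names and the example sequences are hypothetical and serve only to show the calculation: identical residues divided by the length of the reference primer.

```python
def count_identical(primer, reference):
    """Count matching positions between two aligned sequences of equal length."""
    if len(primer) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(primer, reference))


def percent_identity(identical_residues, reference_length):
    """Percent identity relative to the full length of the reference primer."""
    return 100.0 * identical_residues / reference_length


# Hypothetical 20-mer and a variant differing at two positions.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGAACGTACGTACGTACGA"

print(percent_identity(count_identical(variant, reference), 20))  # 18/20
print(percent_identity(15, 20))                                   # 15-mer within a 20-mer
print(round(percent_identity(28, 31), 1))                         # 28-mer within a 31-mer
```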

Percent homology, sequence identity or complementarity, can bedetermined by, for example, the Gap program (Wisconsin Sequence AnalysisPackage, Version 8 for Unix, Genetics Computer Group, UniversityResearch Park, Madison Wis.), using default settings, which uses thealgorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). Insome embodiments, complementarity of primers with respect to theconserved priming regions of viral nucleic acid, is between about 70%and about 80%. In other embodiments, homology, sequence identity orcomplementarity, is between about 80% and about 90%. In yet otherembodiments, homology, sequence identity or complementarity, is at least90%, at least 92%, at least 94%, at least 95%, at least 96%, at least97%, at least 98%, at least 99% or is 100%.

In some embodiments, the primers described herein comprise at least 70%,at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, atleast 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or100% (or any range falling within) sequence identity with the primersequences specifically disclosed herein.

In some embodiments, the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin.

In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5′ end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated A residues as a result of the non-specific enzyme activity of, e.g., Taq DNA polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.

Primers may contain one or more universal bases. Because any variation in the conserved regions among species (due to codon wobble) is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a “universal nucleobase.” For example, under this “wobble” base pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK, an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-beta-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).

In some embodiments, to compensate for weaker binding by the wobblebase, oligonucleotide primers are configured such that the first andsecond positions of each triplet are occupied by nucleotide analogswhich bind with greater affinity than the unmodified nucleotide.Examples of these analogs include, but are not limited to,2,6-diaminopurine which binds to thymine, 5-propynyluracil which bindsto adenine and 5-propynylcytosine and phenoxazines, including G-clamp,which binds to G. Propynylated pyrimidines are described in U.S. Pat.Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly ownedand incorporated herein by reference in its entirety. Propynylatedprimers are described in U.S Pre-Grant Publication No. 2003-0170682;also commonly owned and incorporated herein by reference in itsentirety. Phenoxazines are described in U.S. Pat. Nos. 5,502,177,5,763,588, and 6,005,096, each of which is incorporated herein byreference in its entirety. G-clamps are described in U.S. Pat. Nos.6,007,992 and 6,028,183, each of which is incorporated herein byreference in its entirety.

In some embodiments, non-template primer tags are used to increase the melting temperature (T_m) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.

In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.

In some embodiments, the primers contain mass-modifying tags. Reducingthe total number of possible base compositions of a nucleic acid ofspecific molecular weight provides a means of avoiding a possible sourceof ambiguity in the determination of base composition of amplicons.Addition of mass-modifying tags to certain nucleobases of a given primerwill result in simplification of de novo determination of basecomposition of a given bioagent identifying amplicon from its molecularmass.

In some embodiments, the mass modified nucleobase comprises one or more of the following: for example, 7-deaza-2′-deoxyadenosine-5′-triphosphate, 5-iodo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxyuridine-5′-triphosphate, 5-bromo-2′-deoxycytidine-5′-triphosphate, 5-iodo-2′-deoxycytidine-5′-triphosphate, 5-hydroxy-2′-deoxyuridine-5′-triphosphate, 4-thiothymidine-5′-triphosphate, 5-aza-2′-deoxyuridine-5′-triphosphate, 5-fluoro-2′-deoxyuridine-5′-triphosphate, O6-methyl-2′-deoxyguanosine-5′-triphosphate, N2-methyl-2′-deoxyguanosine-5′-triphosphate, 8-oxo-2′-deoxyguanosine-5′-triphosphate or thiothymidine-5′-triphosphate. In some embodiments, the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C.

In some embodiments, the molecular mass of a given bioagent (e.g., aspecies of HPV) identifying amplicon is determined by mass spectrometry.Mass spectrometry is intrinsically a parallel detection scheme withoutthe need for radioactive or fluorescent labels, because an amplicon isidentified by its molecular mass. The current state of the art in massspectrometry is such that less than femtomole quantities of material canbe analyzed to provide information about the molecular contents of thesample. An accurate assessment of the molecular mass of the material canbe quickly obtained, irrespective of whether the molecular weight of thesample is several hundred, or in excess of one hundred thousand atomicmass units (amu) or Daltons.

In some embodiments, intact molecular ions are generated from amplicons using one of a variety of ionization techniques to convert the sample to the gas phase. These ionization methods include, but are not limited to, electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
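The averaging of mass readings across charge states may be sketched as follows. The sketch assumes negative-ion electrospray, in which each peak arises from an ion that has lost one proton per unit of charge, and assumes that the charge state of each peak has already been assigned. The function names are hypothetical.

```python
from statistics import mean

PROTON_MASS = 1.007276  # Da


def neutral_mass(mz, charge):
    """Neutral mass from one peak, assuming a deprotonated ion [M - zH]^z-."""
    return charge * (mz + PROTON_MASS)


def estimate_mass(peaks):
    """Average the neutral-mass readings from all (m/z, charge) peaks in a spectrum."""
    return mean([neutral_mass(mz, z) for mz, z in peaks])
```

Each charge state supplies an independent reading of the same neutral mass, so the mean of those readings serves as the estimate for the amplicon strand.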

The mass detectors used include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.

In some embodiments, assignment of previously unobserved basecompositions (also known as “true unknown base compositions”) to a givenphylogeny can be accomplished via the use of pattern classifier modelalgorithms. Base compositions, like sequences, may vary slightly fromstrain to strain within species, for example. In some embodiments, thepattern classifier model is the mutational probability model. In otherembodiments, the pattern classifier is the polytope model. A polytopemodel is the mutational probability model that incorporates both therestrictions among strains and position dependence of a given nucleobasewithin a triplet. In certain embodiments, a polytope pattern classifieris used to classify a test or unknown organism according to its ampliconbase composition.

In some embodiments, it is possible to manage this diversity by building“base composition probability clouds” around the composition constraintsfor each species. A “pseudo four-dimensional plot” may be used tovisualize the concept of base composition probability clouds. Optimalprimer design typically involves an optimal choice of bioagentidentifying amplicons and maximizes the separation between the basecomposition signatures of individual bioagents. Areas where cloudsoverlap generally indicate regions that may result in amisclassification, a problem which is overcome by a triangulationidentification process using bioagent identifying amplicons not affectedby overlap of base composition probability clouds.

In some embodiments, base composition probability clouds provide themeans for screening potential primer pairs in order to avoid potentialmisclassifications of base compositions. In other embodiments, basecomposition probability clouds provide the means for predicting theidentity of an unknown bioagent whose assigned base composition has notbeen previously observed and/or indexed in a bioagent identifyingamplicon base composition database due to evolutionary transitions inits nucleic acid sequence. Thus, in contrast to probe-based techniques,mass spectrometry determination of base composition does not requireprior knowledge of the composition or sequence in order to make themeasurement.

Provided herein is bioagent classifying information at a levelsufficient to identify a given bioagent. Furthermore, the process ofdetermining a previously unknown base composition for a given bioagent(for example, in a case where sequence information is unavailable) hasutility by providing additional bioagent indexing information with whichto populate base composition databases. The process of future bioagentidentification is thus improved as additional base composition signatureindexes become available in base composition databases.

In some embodiments, the identity and quantity of an unknown bioagent may be determined using the process illustrated in FIG. 3. Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent. The total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplicons. The molecular masses of amplicons are determined (515) from which are obtained molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (520) provides for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides for its quantification (535). The abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.

In certain embodiments, a sample comprising an unknown bioagent is contacted with a primer pair which amplifies the nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The amplification reaction then produces two amplicons: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon are distinguishable by molecular mass while being amplified at essentially the same rate.

Effecting differential molecular masses can be accomplished by choosing as a calibration sequence a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent by base composition analysis. The abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
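One simple form the quantity calculation might take is sketched below. It assumes that the two amplicons amplify at essentially the same rate, as stated above, and that measured abundance is proportional to amplicon amount for both species. The function name and the numerical values are hypothetical.

```python
def bioagent_quantity(calibrant_quantity, bioagent_abundance, calibrant_abundance):
    """Scale the known calibrant quantity by the ratio of measured abundances."""
    if calibrant_abundance <= 0:
        # No measurable calibration amplicon suggests a failed reaction or analysis.
        raise ValueError("no measurable calibration amplicon")
    return calibrant_quantity * bioagent_abundance / calibrant_abundance


# Illustrative values only: 100 calibrant copies added to the sample,
# bioagent amplicon signal twice that of the calibration amplicon.
print(bioagent_quantity(100, 3000.0, 1500.0))
```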

In some embodiments, construction of a standard curve in which the amount of calibration or calibrant polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. Alternatively, the calibration polynucleotide can be amplified in its own reaction vessel or vessels under the same conditions as the bioagent. A standard curve may be prepared therefrom, and the relative abundance of the bioagent determined by methods such as linear regression. In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single construct (preferably a vector) which functions as the calibration polynucleotide.
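A standard curve of the kind described above could be fitted by ordinary least squares, as in the sketch below. The sketch assumes a linear relationship between the amount of calibrant and the measured abundance of its amplicon. The function names and the data points are hypothetical.

```python
def fit_standard_curve(amounts, abundances):
    """Least-squares line: abundance = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(abundances) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, abundances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_abundance(abundance, slope, intercept):
    """Read an unknown amount off the fitted curve."""
    return (abundance - intercept) / slope


# Illustrative calibrant amounts and measured amplicon abundances.
slope, intercept = fit_standard_curve([10, 100, 1000], [25.0, 205.0, 2005.0])
print(amount_from_abundance(405.0, slope, intercept))
```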

In some embodiments, the calibrant polynucleotide is used as an internalpositive control to confirm that amplification conditions and subsequentanalysis steps are successful in producing a measurable amplicon. Evenin the absence of copies of the genome of a bioagent, the calibrationpolynucleotide gives rise to a calibration amplicon. Failure to producea measurable calibration amplicon indicates a failure of amplificationor subsequent analysis step such as amplicon purification or molecularmass determination. Reaching a conclusion that such failures haveoccurred is, in itself, a useful event. In some embodiments, thecalibration sequence is comprised of DNA. In some embodiments, thecalibration sequence is comprised of RNA.

In some embodiments, a calibration sequence is inserted into a vectorwhich then functions as the calibration polynucleotide. In someembodiments, more than one calibration sequence is inserted into thevector that functions as the calibration polynucleotide. Such acalibration polynucleotide is herein termed a “combination calibrationpolynucleotide.” It should be recognized that the calibration methodshould not be limited to the embodiments described herein. Thecalibration method can be applied for determination of the quantity ofany bioagent identifying amplicon when an appropriate standard calibrantpolynucleotide sequence is designed and used.

In certain embodiments, primer pairs are configured to produce bioagent identifying amplicons within more conserved regions of an HPV, while others produce bioagent identifying amplicons within regions that may evolve more quickly. Primer pairs that characterize amplicons in a conserved region with low probability that the region will evolve past the point of primer recognition are useful, e.g., as a broad range survey-type primer. Primer pairs that characterize an amplicon corresponding to an evolving genomic region are useful, e.g., for distinguishing emerging bioagent strain variants.

The primer pairs described herein provide reagents, e.g., for identifying diseases caused by emerging types of HPV. Base composition analysis eliminates the need for prior knowledge of bioagent sequence to generate hybridization probes. Thus, in another embodiment, there is provided a method for determining the etiology of a particular strain when the process of identification is carried out in a clinical setting, and even when a new strain is involved. This is possible because the methods may not be confounded by naturally occurring evolutionary variations.

Another embodiment provides a means of tracking the spread of anyspecies or strain of HPV when a plurality of samples obtained fromdifferent geographical locations are analyzed by methods described abovein an epidemiological setting. For example, a plurality of samples froma plurality of different locations may be analyzed with primers whichproduce bioagent identifying amplicons, a subset of which identifies aspecific strain. The corresponding locations of the members of thestrain-containing subset indicate the spread of the specific strain tothe corresponding locations.

Also provided are kits for carrying out the methods described herein. Insome embodiments, the kit may comprise a sufficient quantity of one ormore primer pairs to perform an amplification reaction on a targetpolynucleotide from a bioagent to form a bioagent identifying amplicon.In some embodiments, the kit may comprise from one to twenty primerpairs, from one to ten primer pairs, from one to eight pairs, from oneto five primer pairs, from one to three primer pairs, or from one to twoprimer pairs. In some embodiments, the kit may comprise one or moreprimer pairs recited in Tables 1 and 2. In certain embodiments, kitsinclude all of the primer pairs recited in Table 1, or Table 2, orTables 1 and 2.

In some embodiments, the kit may also comprise a sufficient quantity ofreverse transcriptase, a DNA polymerase, suitable nucleosidetriphosphates (including any of those described above), a DNA ligase,and/or reaction buffer, or any combination thereof, for theamplification processes described above. A kit may further includeinstructions pertinent for the particular embodiment of the kit, suchinstructions describing the primer pairs and amplification conditionsfor operation of the method. In some embodiments, the kit furthercomprises instructions for analysis, interpretation and dissemination ofdata acquired by the kit. In other embodiments, instructions for theoperation, analysis, interpretation and dissemination of the data of thekit are provided on computer readable media. A kit may also compriseamplification reaction containers such as microcentrifuge tubes,microtiter plates, and the like. A kit may also comprise reagents orother materials for isolating bioagent nucleic acid or bioagentidentifying amplicons from amplification reactions, including, forexample, detergents, solvents, or ion exchange resins which may belinked to magnetic beads. A kit may also comprise a table of measured orcalculated molecular masses and/or base compositions of bioagents usingthe primer pairs of the kit.

The invention also provides systems that can be used to perform variousassays relating to HPV detection or identification. In certainembodiments, systems include mass spectrometers configured to detectmolecular masses of amplicons produced using purified oligonucleotideprimer pairs described herein. Other detectors that are optionallyadapted for use in the systems of the invention are described furtherbelow. In some embodiments, systems also include controllers operablyconnected to mass spectrometers and/or other system components. In someof these embodiments, controllers are configured to correlate themolecular masses of the amplicons with bioagents to effect detection oridentification. In some embodiments, controllers are configured todetermine base compositions of the amplicons from the molecular massesof the amplicons. As described herein, the base compositions generallycorrespond to the HPV species identities. In certain embodiments,controllers include, or are operably connected to, databases of knownmolecular masses and/or known base compositions of amplicons of knownspecies of HPV produced with the primer pairs described herein.Controllers are described further below.

In some embodiments, systems include one or more of the primer pairsdescribed herein (e.g., in Tables 1 and 2). In certain embodiments, theoligonucleotides are arrayed on solid supports, whereas in others, theyare provided in one or more containers, e.g., for assays performed insolution. In certain embodiments, the systems also include at least onedetector or detection component (e.g., a spectrometer) that isconfigured to detect detectable signals produced in the container or onthe support. In addition, the systems also optionally include at leastone thermal modulator (e.g., a thermal cycling device) operablyconnected to the containers or solid supports to modulate temperature inthe containers or on the solid supports, and/or at least one fluidtransfer component (e.g., an automated pipettor) that transfers fluid toand/or from the containers or solid supports, e.g., for performing oneor more assays (e.g., nucleic acid amplification, real-time amplicondetection, etc.) in the containers or on the solid supports.

Detectors are typically structured to detect detectable signals produced, e.g., in or proximal to another component of the given assay system (e.g., in a container and/or on a solid support). Suitable signal detectors that are optionally utilized, or adapted for use, herein detect, e.g., fluorescence, phosphorescence, radioactivity, absorbance, refractive index, luminescence, or mass. Detectors optionally monitor one or a plurality of signals from upstream and/or downstream of the performance of, e.g., a given assay step. For example, detectors optionally monitor a plurality of optical signals, which correspond in position to “real-time” results. Example detectors or sensors include photomultiplier tubes, CCD arrays, optical sensors, temperature sensors, pressure sensors, pH sensors, conductivity sensors, or scanning detectors. Detectors are also described in, e.g., Skoog et al., Principles of Instrumental Analysis, 5th Ed., Harcourt Brace College Publishers (1998), Currell, Analytical Instrumentation: Performance Characteristics and Quality, John Wiley & Sons, Inc. (2000), Sharma et al., Introduction to Fluorescence Spectroscopy, John Wiley & Sons, Inc. (1999), Valeur, Molecular Fluorescence: Principles and Applications, John Wiley & Sons, Inc. (2002), and Gore, Spectrophotometry and Spectrofluorimetry: A Practical Approach, 2nd Ed., Oxford University Press (2000), which are each incorporated by reference.

As mentioned above, the systems of the invention also typically includecontrollers that are operably connected to one or more components (e.g.,detectors, databases, thermal modulators, fluid transfer components,robotic material handling devices, and the like) of the given system tocontrol operation of the components. More specifically, controllers aregenerally included either as separate or integral system components thatare utilized, e.g., to receive data from detectors (e.g., molecularmasses, etc.), to effect and/or regulate temperature in the containers,or to effect and/or regulate fluid flow to or from selected containers.Controllers and/or other system components are optionally coupled to anappropriately programmed processor, computer, digital device,information appliance, or other logic device (e.g., including an analogto digital or digital to analog converter as needed), which functions toinstruct the operation of these instruments in accordance withpreprogrammed or user input instructions, receive data and informationfrom these instruments, and interpret, manipulate and report thisinformation to the user. Suitable controllers are generally known in theart and are available from various commercial sources.

Any controller or computer optionally includes a monitor, which is oftena cathode ray tube (“CRT”) display, a flat panel display (e.g., activematrix liquid crystal display or liquid crystal display), or others.Computer circuitry is often placed in a box, which includes numerousintegrated circuit chips, such as a microprocessor, memory, interfacecircuits, and others. The box also optionally includes a hard diskdrive, a floppy disk drive, a high capacity removable drive such as awriteable CD-ROM, and other common peripheral elements. Inputtingdevices such as a keyboard or mouse optionally provide for input from auser. These components are illustrated further below.

The computer typically includes appropriate software for receiving userinstructions, either in the form of user input into a set of parameterfields, e.g., in a graphic user interface (GUI), or in the form ofpreprogrammed instructions, e.g., preprogrammed for a variety ofdifferent specific operations. The software then converts theseinstructions to appropriate language for instructing the operation ofone or more controllers to carry out the desired operation. The computerthen receives the data from, e.g., sensors/detectors included within thesystem, and interprets the data, either provides it in a user understoodformat, or uses that data to initiate further controller instructions,in accordance with the programming.

FIG. 4 is a schematic showing a representative system that includes alogic device in which various aspects of the present invention may beembodied. As will be understood by practitioners in the art from theteachings provided herein, aspects of the invention are optionallyimplemented in hardware and/or software. In some embodiments, differentaspects of the invention are implemented in either client-side logic orserver-side logic. As will be understood in the art, the invention orcomponents thereof may be embodied in a media program component (e.g., afixed media component) containing logic instructions and/or data that,when loaded into an appropriately configured computing device, causethat device to perform as desired. As will also be understood in theart, a fixed media containing logic instructions may be delivered to aviewer on a fixed media for physically loading into a viewer's computeror a fixed media containing logic instructions may reside on a remoteserver that a viewer accesses through a communication medium in order todownload a program component.

More specifically, FIG. 4 schematically illustrates computer 1000 towhich mass spectrometer 1002 (e.g., an ESI-TOF mass spectrometer, etc.),fluid transfer component 1004 (e.g., an automated mass spectrometersample injection needle or the like), and database 1008 are operablyconnected. Optionally, one or more of these components are operablyconnected to computer 1000 via a server (not shown in FIG. 4). Duringoperation, fluid transfer component 1004 typically transfers reactionmixtures or components thereof (e.g., aliquots comprising amplicons)from multi-well container 1006 to mass spectrometer 1002. Massspectrometer 1002 then detects molecular masses of the amplicons.Computer 1000 then typically receives this molecular mass data,calculates base compositions from this data, and compares it withentries in database 1008 to identify species or strains of HPV in agiven sample. It will be apparent to one of skill in the art that one ormore components of the system schematically depicted in FIG. 4 areoptionally fabricated integral with one another (e.g., in the samehousing).

While the present invention has been described with specificity in accordance with certain of its embodiments, the following examples serve only to illustrate the invention and are not intended to limit the same. In order that the invention disclosed herein may be more efficiently understood, examples are provided below. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting the invention in any manner.

Example 1 High-Throughput ESI-Mass Spectrometry Assay for the Identification of HPV

This example describes an HPV pathogen identification assay which employs mass spectrometry determined base compositions for PCR amplicons derived from HPV. The T5000 Biosensor System is a mass spectrometry based universal biosensor that uses mass measurements to derive base compositions of PCR amplicons to identify bioagents including, for example, bacteria, fungi, viruses and protozoa (S. A. Hofstadler et al., Int. J. Mass Spectrom. (2005) 242:23-41, herein incorporated by reference). For this HPV assay, primers from Tables 1 and 2 may be employed to generate PCR amplicons. The base composition of the PCR amplicons can be determined and compared to a database of known HPV base compositions to determine the identity of an HPV in a sample. Tables 1 and 2 show exemplary primer pairs for detecting alphapapillomavirus, betapapillomavirus, gammapapillomavirus, Mupapillomavirus, and Nupapillomavirus. In Tables 1A and 2A, “I” represents inosine and Tp=5-propynyluracil (also known as propynylated thymine).

TABLE 1A Primer Sequences

Primer Pair Number  Primer Direction  Primer Sequence                      SEQ ID NO:
2534                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     1
2534                Reverse           TCTGCAACCATTTGCAAATAATCTGGATATTT     9
2537                Forward           TTCAGATGTCTGTGTGGCGGCCTA             2
2537                Reverse           TACATATTCATCCGTGCTTACAACCTTAGA       10
2540                Forward           TAGATGATAGTGACATTGCATATAAATATGCA     5
2540                Reverse           TTTCTGCTCGTTTATAATGTCTACACAT         13
2544                Forward           TGACGAACCACAGCGTCACA                 6
2544                Reverse           TGCACACAACGGACACACAAA                14
2545                Forward           TCGGGATGTAATGGCTGGTT                 3
2545                Reverse           TACCATGTCCGAACCTGTATCTGT             11
2546                Forward           TCAGGATGGTTTTTGGTAGAGGCTATAGT        4
2546                Reverse           TGCCTGTGCTTCCAAGGAATTGTGTGTAATA      12
2547                Forward           TACACACAATTCCTTGGAAGCACAGGCA         7
2547                Reverse           TTAGGTCCTGCACAGCCGCATAATG            15
2684                Forward           TACTGTTATICAGGATGGTGATATGGT          8
2684                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTT     16

TABLE 1B Primer Pair Names and Reference Amplicon Lengths

Primer Pair Number  Primer Pair Name                    Reference Amplicon Length
2534                PAV_IMP_NC001526_6222-6355_2        134
2537                PaV_A9_NC001526_5632-5720           89
2540                PaV_A9_NC001526_1972-2112           141
2544                PaV_A7_NC001357_748-895             148
2545                PaV_A7_NC001357_947-1057            111
2546                PaV_A10_NC000904_875-1027           153
2547                PaV_A10_NC000904_1000-1079          80
2684                PAV_IMP_MOD_NC001526_6212_6355_4    144

TABLE 1C Individual Primer Names and Hybridization Coordinates

Primer Pair Number  Primer Direction  Individual Primer Names
2534                Forward           PAV_IMP_NC001526_6222_6253_F
2534                Reverse           PAV_IMP_NC001526_6324_6355_R
2537                Forward           PAV_A9_NC001526_5632_5655_F
2537                Reverse           PAV_A9_NC001526_5691_5720_R
2540                Forward           PAV_A9_NC001526_1972_2003_F
2540                Reverse           PAV_A9_NC001526_2085_2112_R
2544                Forward           PAV_A7_NC001357_748_767_F
2544                Reverse           PAV_A7_NC001357_875_895_R
2545                Forward           PAV_A7_NC001357_947_966_F
2545                Reverse           PAV_A7_NC001357_1034_1057_R
2546                Forward           PAV_A10_NC000904_875_903_F
2546                Reverse           PAV_A10_NC000904_997_1027_R
2547                Forward           PAV_A10_NC000904_1000_1027_F
2547                Reverse           PAV_A10_NC000904_1055_1079_R
2684                Forward           PAV_IMP_MOD_NC001526_6212_6238_2_F
2684                Reverse           PAV_IMP_MOD_NC001526_6324_6355_2_R
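The hybridization coordinates embedded in the individual primer names of Table 1C are consistent with the reference amplicon lengths of Table 1B: the amplicon runs from the first coordinate of the forward primer to the last coordinate of the reverse primer, inclusive. The following minimal sketch illustrates that relationship; the function names and the regular expression are illustrative assumptions about the naming convention and are not part of the disclosed method.

```python
import re
from typing import Tuple

# Assumed naming convention: ..._<accession>_<start>_<end>..._<F|R>,
# where the accession begins with "NC" (e.g. NC001526).
_COORDINATES = re.compile(r"_NC\d+_(\d+)_(\d+)")


def hybridization_coordinates(primer_name: str) -> Tuple[int, int]:
    """Extract (start, end) from a name such as PAV_A9_NC001526_5632_5655_F."""
    match = _COORDINATES.search(primer_name)
    if match is None:
        raise ValueError(f"no coordinates found in {primer_name!r}")
    return int(match.group(1)), int(match.group(2))


def reference_amplicon_length(forward_name: str, reverse_name: str) -> int:
    """Length from the forward primer start to the reverse primer end, inclusive."""
    forward_start, _ = hybridization_coordinates(forward_name)
    _, reverse_end = hybridization_coordinates(reverse_name)
    return reverse_end - forward_start + 1


if __name__ == "__main__":
    # Primer pair 2534 from Table 1C; Table 1B lists a reference length of 134.
    print(reference_amplicon_length("PAV_IMP_NC001526_6222_6253_F",
                                    "PAV_IMP_NC001526_6324_6355_R"))
```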

TABLE 1D Gene Targets and GenBank gi Numbers

Primer Pair Number  Target Genome Segment  HPV Strains Resolved    GenBank Reference gi Number
2534                major capsid protein   All Important HPVs      9627100
2537                major capsid protein   16/31 and other A9 HPV  9627100
2540                replication protein    16/31 and other A9 HPV  9627100
2544                E7 protein             18/45 and other A7 HPV  9626069
2545                E1 protein             18/45 and other A7 HPV  9626069
2546                E1 protein             6/11 and other A10 HPV  9633484
2547                E1 protein             6/11 and other A10 HPV  9633484
2684                major capsid protein   All Important HPVs      9627100

TABLE 2A Primer Sequences

Primer Pair Number  Primer Direction  Primer Sequence                      SEQ ID NO:
2670                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     17
2670                Reverse           TCTGCAACCATTTGCAAATAATCTGGATATTTICA  44
2671                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     18
2671                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTTICA  45
2672                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     19
2672                Reverse           TCTGCAACCATTTIIAAATAATCTGGATATTTICA  46
2673                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     20
2673                Reverse           TCTGCAACCATpTpTpGAAAATAATCTGGATATTT  47
2674                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     21
2674                Reverse           TCTGCAACCATTTGAAAATAATCTGGATATTT     48
2675                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     22
2675                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTT     49
2676                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     23
2676                Reverse           TCTGCAACCATTTIIAAATAATCTGGATATTT     50
2677                Forward           TAGGATGGTGATATGGTTGATACAGGCTTTGG     24
2677                Reverse           TCTGCAACCATTTIIAIATAATCTGGATATTT     51
2678                Forward           TAGGATGGTGATATGGTTGATACAGGCTITGG     25
2678                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTT     52
2679                Forward           TAGGATGGTGATATGGTTGATACAGGITITGG     26
2679                Reverse           TCTGCAACCATTTIIAAATAATCTGGATATTT     53
2680                Forward           TAGGATGGTGATATGGTTGATACIGGITITGG     27
2680                Reverse           TCTGCAACCATTTIIAIATAATCTGGATATTT     54
2681                Forward           TACTGTTATTCAGGATGGTGATATGGT          28
2681                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTT     55
2682                Forward           TACTGTTATTCAGGATGGTGATATGGT          29
2682                Reverse           TCTGCAACCATTTIIAAATAATCTGGATATTT     56
2683                Forward           TACTGTTATTCAGGATGGTGATATGGT          30
2683                Reverse           TCTGCAACCATTTIIAIATAATCTGGATATTT     57
2684                Forward           TACTGTTATICAGGATGGTGATATGGT          8
2684                Reverse           TCTGCAACCATTTGIAAATAATCTGGATATTT     16
2685                Forward           TACTGTTATTCAGGATGGIGATATGGT          32
2685                Reverse           TCTGCAACCATTTIIAAATAATCTGGATATTT     59
2686                Forward           TACTGTTATICAGGATGGIGATATGGT          33
2686                Reverse           TCTGCAACCATTTIIAIATAATCTGGATATTT     60
2687                Forward           TTCAGATGTCTGTGTGGCIGCCTA             34
2687                Reverse           TACATATTCATCCGTGCTTACAACCTTAGA       61
2688                Forward           TTCAGATGTCTITGTGGCIGCCTA             35
2688                Reverse           TACATATTCATCCGTGCTTACAACCTTAGA       62
2689                Forward           TGGAAATCCTTTTTCTCAAGGACGTGGT         36
2689                Reverse           TAGTATTTTGTCCTGCCACICATTTAAACG       63
2690                Forward           TGGAAATCCTTTTTCTCAAGGACGTGGT         37
2690                Reverse           TAGTATTTTGTCCTGCCAIICATTTAAACG       64
2691                Forward           TGGAAATCCTTTTTCTCAAGGACGTGGT         38
2691                Reverse           TAGTATTTTGTCCTGCCIIICATTTAAACG       65
2692                Forward           TAGATGATAGTGAIATIGCATATIAATATGCA     39
2692                Reverse           TTTCTGCTCGTTTATAATGTCTACACAT         66
2693                Forward           TATGGTGCAGTGGGCATTTGATAATG           40
2693                Reverse           TTGCTTTTTAAAAATGCAGIIGCATT           67
2694                Forward           TATGGTGCAGTGGGCATTTGATAATG           41
2694                Reverse           TTGCTTTTTAAAAATGCIIIIGCATT           68
2695                Forward           TATGGTGCAGTGGGCATITGATAATG           42
2695                Reverse           TTGCTTTTTAAAAATGCAGIIGCATT           69
2696                Forward           TATGGTGCAGTGGGCATTTGATAATG           43
2696                Reverse           TATTTGCCTGCIIATTGCTITTTAAAAA         70

TABLE 2B Primer Pair Names and Reference Amplicon Lengths

Primer Pair Number  Primer Pair Name                    Reference Amplicon Length
2670                PAV_IMP_MOD_NC001526_6222_6355      134
2671                PAV_IMP_MOD_NC001526_6222_6355_2    134
2672                PAV_IMP_MOD_NC001526_6222_6355_3    134
2673                PAV_IMP_MOD_NC001526_6222_6355_4P   134
2674                PAV_IMP_MOD_NC001526_6222_6355_5    134
2675                PAV_IMP_MOD_NC001526_6222_6355_6    134
2676                PAV_IMP_MOD_NC001526_6222_6355_7    134
2677                PAV_IMP_MOD_NC001526_6222_6355_8    134
2678                PAV_IMP_MOD_NC001526_6222_6355_9    134
2679                PAV_IMP_MOD_NC001526_6222_6355_10   134
2680                PAV_IMP_MOD_NC001526_6222_6355_11   134
2681                PAV_IMP_MOD_NC001526_6212_6355      144
2682                PAV_IMP_MOD_NC001526_6212_6355_2    144
2683                PAV_IMP_MOD_NC001526_6212_6355_3    144
2684                PaV_IMP_NC001526_6212-6355_4        144
2684                PAV_IMP_MOD_NC001526_6212_6355_4    144
2685                PAV_IMP_MOD_NC001526_6212_6355_5    144
2686                PAV_IMP_MOD_NC001526_6212_6355_6    144
2687                PAV_A9_MOD_NC001526_5632_5720       89
2688                PAV_A9_MOD_NC001526_5632_5720_2     89
2689                PAV_A9_MOD_NC001526_2688_2802       115
2690                PAV_A9_MOD_NC001526_2688_2802_2     115
2691                PAV_A9_MOD_NC001526_2688_2802_3     115
2692                PAV_A9_MOD_NC001526_1972_2112       141
2693                PAV_A7_A10_NC000904_1912_2022       111
2694                PAV_A7_A10_NC000904_1912_2022_2     111
2695                PAV_A7_A10_NC000904_1912_2022_3     111
2696                PAV_A7_A10_NC000904_1912_2036       125

TABLE 2C Individual Primer Names and Hybridization Coordinates

Primer Pair Number  Primer Direction  Individual Primer Names
2670                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2670                Reverse           PAV_IMP_MOD_NC001526_6321_6355_R
2671                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2671                Reverse           PAV_IMP_MOD_NC001526_6321_6355_2_R
2672                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2672                Reverse           PAV_IMP_MOD_NC001526_6321_6355_3_R
2673                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2673                Reverse           PAV_IMP_MOD_NC001526_6324_6355P_R
2674                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2674                Reverse           PAV_IMP_MOD_NC001526_6324_6355_R
2675                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2675                Reverse           PAV_IMP_MOD_NC001526_6324_6355_2_R
2676                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2676                Reverse           PAV_IMP_MOD_NC001526_6324_6355_3_R
2677                Forward           PAV_IMP_MOD_NC001526_6222_6253_F
2677                Reverse           PAV_IMP_MOD_NC001526_6324_6355_4_R
2678                Forward           PAV_IMP_MOD_NC001526_6222_6253_2_F
2678                Reverse           PAV_IMP_MOD_NC001526_6324_6355_2_R
2679                Forward           PAV_IMP_MOD_NC001526_6222_6253_3_F
2679                Reverse           PAV_IMP_MOD_NC001526_6324_6355_3_R
2680                Forward           PAV_IMP_MOD_NC001526_6222_6253_4_F
2680                Reverse           PAV_IMP_MOD_NC001526_6324_6355_4_R
2681                Forward           PAV_IMP_MOD_NC001526_6212_6238_F
2681                Reverse           PAV_IMP_MOD_NC001526_6324_6355_2_R
2682                Forward           PAV_IMP_MOD_NC001526_6212_6238_F
2682                Reverse           PAV_IMP_MOD_NC001526_6324_6355_3_R
2683                Forward           PAV_IMP_MOD_NC001526_6212_6238_F
2683                Reverse           PAV_IMP_MOD_NC001526_6324_6355_4_R
2684                Forward           PAV_IMP_MOD_NC001526_6212_6238_2_F
2684                Reverse           PAV_IMP_MOD_NC001526_6324_6355_2_R
2685                Forward           PAV_IMP_MOD_NC001526_6212_6238_3_F
2685                Reverse           PAV_IMP_MOD_NC001526_6324_6355_3_R
2686                Forward           PAV_IMP_MOD_NC001526_6212_6238_4_F
2686                Reverse           PAV_IMP_MOD_NC001526_6324_6355_4_R
2687                Forward           PAV_A9_MOD_NC001526_5632_5655_F
2687                Reverse           PAV_A9_MOD_NC001526_5691_5720_R
2688                Forward           PAV_A9_MOD_NC001526_5632_5655_2_F
2688                Reverse           PAV_A9_MOD_NC001526_5691_5720_R
2689                Forward           PAV_A9_MOD_NC001526_2688_2715_F
2689                Reverse           PAV_A9_MOD_NC001526_2773_2802_R
2690                Forward           PAV_A9_MOD_NC001526_2688_2715_F
2690                Reverse           PAV_A9_MOD_NC001526_2773_2802_2_R
2691                Forward           PAV_A9_MOD_NC001526_2688_2715_F
2691                Reverse           PAV_A9_MOD_NC001526_2773_2802_3_R
2692                Forward           PAV_A9_MOD_NC001526_1972_2003_F
2692                Reverse           PAV_A9_MOD_NC001526_2085_2112_R
2693                Forward           PAV_A7_A10_NC000904_1912_1937_F
2693                Reverse           PAV_A7_A10_NC000904_1997_2022_R
2694                Forward           PAV_A7_A10_NC000904_1912_1937_F
2694                Reverse           PAV_A7_A10_NC000904_1997_2022_2_R
2695                Forward           PAV_A7_A10_NC000904_1912_1937_2_F
2695                Reverse           PAV_A7_A10_NC000904_1997_2022_R
2696                Forward           PAV_A7_A10_NC000904_1912_1937_F
2696                Reverse           PAV_A7_A10_NC000904_2009_2036_R

TABLE 1D Gene Targets and GenBank gi Numbers

Primer Pair Number  Target Genome Segment  HPV Strains Resolved    GenBank Reference gi Number
2534                major capsid protein   All                     9627100
2537                major capsid protein   All                     9627100
2540                replication protein    All                     9627100
2544                E7 protein             All                     9626069
2545                E1 protein             All                     9626069
2546                E1 protein             All                     9633484
2547                E1 protein             All                     9633484
2670                major capsid protein   All                     9627100
2671                major capsid protein   All                     9627100
2672                major capsid protein   All                     9627100
2673                major capsid protein   All                     9627100
2674                major capsid protein   All                     9627100
2675                major capsid protein   All                     9627100
2676                major capsid protein   All                     9627100
2677                major capsid protein   All                     9627100
2678                major capsid protein   All                     9627100
2679                major capsid protein   All                     9627100
2680                major capsid protein   16/31                   9627100
2681                major capsid protein   16/31                   9627100
2682                major capsid protein   16/31                   9627100
2683                major capsid protein   16/31                   9627100
2684                major capsid protein   16/31                   9627100
2684                major capsid protein   16/31                   9627100
2685                major capsid protein   18/45 + 6/11            9627100
2686                major capsid protein   18/45 + 6/12            9627100
2687                major capsid protein   18/45 + 6/11            9627100
2688                replication protein    18/45 + 6/12            9627100
2689                replication protein    All                     9627100

It is noted that the primer pairs in Tables 1 and 2 could be combined into a single panel for detection of one or more HPV (e.g., multiple types of HPV). The primers and primer pairs of Tables 1 and 2 could be used, for example, to detect human and animal infections. These primers and primer pairs may also be grouped (e.g., in panels or kits) for multiplex detection of other bioagents such as flavivirus, alphavirus, adenovirus, and other bioagents. In particular embodiments, the primers are used in assays for testing product safety.

Example 2 De Novo Determination of Base Composition of Amplicons Using Molecular Mass Modified Deoxynucleotide Triphosphates

Because the molecular masses of the four natural nucleobases fall within a narrow molecular mass range (A=313.058, G=329.052, C=289.046, T=304.046, values in Daltons; see Table 3), a source of ambiguity in assignment of base composition may occur as follows: two nucleic acid strands having different base compositions may differ in mass by only about 1 Da when the base composition difference between the two strands is G-->A (−15.994) combined with C-->T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A₂₇G₃₀C₂₁T₂₁ has a theoretical molecular mass of 30779.058, while another 99-mer nucleic acid strand having a base composition of A₂₆G₃₁C₂₂T₂₀ has a theoretical molecular mass of 30780.052, a molecular mass difference of only 0.994 Da. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor in this type of situation. One method for removing this theoretical 1 Da uncertainty factor uses amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases.

Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a significant difference in mass of the resulting amplicon (greater than 1 Da) arising from ambiguities such as the G-->A combined with C-->T event (Table 3). Thus, the same G-->A (−15.994) event combined with a 5-Iodo-C-->T (−110.900) event would result in a molecular mass difference of 126.894 Da. The molecular mass of the base composition A₂₇G₃₀5-Iodo-C₂₁T₂₁ (33422.958) compared with that of A₂₆G₃₁5-Iodo-C₂₂T₂₀ (33549.852) gives a theoretical molecular mass difference of +126.894 Da. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A₂₇G₃₀5-Iodo-C₂₁T₂₁. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.

TABLE 3 Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions

Nucleobase  Molecular Mass  Transition      Δ Molecular Mass
A           313.058         A-->T           −9.012
A           313.058         A-->C           −24.012
A           313.058         A-->5-Iodo-C    101.888
A           313.058         A-->G           15.994
T           304.046         T-->A           9.012
T           304.046         T-->C           −15.000
T           304.046         T-->5-Iodo-C    110.900
T           304.046         T-->G           25.006
C           289.046         C-->A           24.012
C           289.046         C-->T           15.000
C           289.046         C-->G           40.006
5-Iodo-C    414.946         5-Iodo-C-->A    −101.888
5-Iodo-C    414.946         5-Iodo-C-->T    −110.900
5-Iodo-C    414.946         5-Iodo-C-->G    −85.894
G           329.052         G-->A           −15.994
G           329.052         G-->T           −25.006
G           329.052         G-->C           −40.006
G           329.052         G-->5-Iodo-C    85.894
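The theoretical masses quoted in this example follow directly from the per-nucleobase masses of Table 3. The short sketch below reproduces the arithmetic for the two 99-mer compositions, with and without the 5-Iodo-C mass tag. The names `MASS` and `strand_mass` are illustrative only, and the calculation simply sums the tabulated per-base values as the text does; `Decimal` is used so that the three-decimal values are handled exactly.

```python
from decimal import Decimal
from typing import Dict

# Per-nucleobase masses in Daltons, copied from Table 3.
MASS: Dict[str, Decimal] = {
    "A": Decimal("313.058"),
    "G": Decimal("329.052"),
    "C": Decimal("289.046"),
    "T": Decimal("304.046"),
    "5-Iodo-C": Decimal("414.946"),
}


def strand_mass(counts: Dict[str, int]) -> Decimal:
    """Sum the tabulated mass of every nucleobase in a base composition."""
    return sum((MASS[base] * n for base, n in counts.items()), Decimal(0))


if __name__ == "__main__":
    natural_1 = {"A": 27, "G": 30, "C": 21, "T": 21}
    natural_2 = {"A": 26, "G": 31, "C": 22, "T": 20}
    tagged_1 = {"A": 27, "G": 30, "5-Iodo-C": 21, "T": 21}
    tagged_2 = {"A": 26, "G": 31, "5-Iodo-C": 22, "T": 20}

    print(strand_mass(natural_1), strand_mass(natural_2))    # 30779.058 30780.052
    print(strand_mass(natural_2) - strand_mass(natural_1))   # 0.994
    print(strand_mass(tagged_1), strand_mass(tagged_2))      # 33422.958 33549.852
    print(strand_mass(tagged_2) - strand_mass(tagged_1))     # 126.894
```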

Mass spectra of bioagent-identifying amplicons may be analyzed using a maximum-likelihood processor, as is widely used in radar signal processing. This processor first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the response to a calibrant for each primer.

The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-detection plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bioagents (e.g., types of HPV) and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this “cleaned up” data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise-covariance for the cleaned up data.

The amplitudes of all base compositions of bioagent-identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplicon corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.
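The two-stage idea described above (estimate and subtract background signatures, then estimate the target on the cleaned-up data) can be illustrated with a deliberately simplified sketch. It assumes white, uncorrelated noise, so the maximum likelihood amplitude reduces to a normalized correlation with the matched filter; the processor described here instead uses a running-sum estimate of the noise covariance, which this sketch does not model. All names and the toy spectrum are hypothetical.

```python
from typing import List, Sequence


def ml_amplitude(spectrum: Sequence[float], template: Sequence[float]) -> float:
    """Matched-filter amplitude estimate, assuming white Gaussian noise."""
    numerator = sum(x * s for x, s in zip(spectrum, template))
    denominator = sum(s * s for s in template)
    return numerator / denominator


def two_stage_estimate(spectrum: Sequence[float],
                       background_templates: Sequence[Sequence[float]],
                       target_template: Sequence[float]) -> float:
    """Subtract estimated background signatures, then estimate the target."""
    cleaned: List[float] = list(spectrum)
    for template in background_templates:
        # Stage 1: estimate each background amplitude (not allowed to go negative)
        # and remove its signature from the data.
        amplitude = max(0.0, ml_amplitude(cleaned, template))
        cleaned = [x - amplitude * s for x, s in zip(cleaned, template)]
    # Stage 2: matched filter for the organism of interest on the cleaned data.
    return max(0.0, ml_amplitude(cleaned, target_template))


if __name__ == "__main__":
    background = [1.0, 0.0, 0.0, 0.0]   # toy background signature
    target = [0.0, 0.0, 1.0, 1.0]       # toy target signature
    observed = [3.0, 0.0, 2.0, 2.0]     # 3 x background + 2 x target, no noise
    print(two_stage_estimate(observed, [background], target))  # 2.0
```

Sequential subtraction of this kind is exact only when the signatures do not overlap, as in the toy data; overlapping signatures would call for a joint estimate.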

Base count blurring may be carried out as follows. Electronic PCR can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See, for example, Schuler, Genome Res. 7:541-50, 1997; or the e-PCR program available from National Center for Biotechnology Information (NCBI, NIH, Bethesda, Md.). In one embodiment, one or more spreadsheets from a workbook comprising a plurality of spreadsheets may be used (e.g., Microsoft Excel). First, in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data. Second, there is a worksheet named “filtered bioagents base count” that contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with fewer than 10 strains. Third, there is a worksheet, “Sheet1”, that contains the frequency of substitutions, insertions, or deletions for this primer pair. This data is generated by first creating a pivot table from the data in the “filtered bioagents base count” worksheet and then executing an Excel VBA macro. The macro creates a table of differences in base counts for bioagents of the same species, but different strains.

Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent. The reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold. The set of reference base counts is defined by selecting the most abundant strain's base type composition and adding it to the reference set, and then the next most abundant strain's base type composition is added until the threshold is met or exceeded.
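The selection of the reference set can be sketched as a greedy accumulation over base counts ordered by abundance. The following is a minimal illustration written in Python rather than the Excel VBA macro mentioned above; the function name `reference_base_counts` and the strain data in the usage example are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

BaseCount = Tuple[int, int, int, int]  # assumed ordering: (A, G, C, T)


def reference_base_counts(strain_base_counts: Sequence[BaseCount],
                          threshold: float) -> List[BaseCount]:
    """Add base counts, most abundant first, until the fraction of strains
    represented meets or exceeds the user-defined threshold."""
    tally = Counter(strain_base_counts)
    total = sum(tally.values())
    reference: List[BaseCount] = []
    covered = 0
    if total == 0:
        return reference
    for base_count, n_strains in tally.most_common():
        if covered / total >= threshold:
            break
        reference.append(base_count)
        covered += n_strains
    return reference


if __name__ == "__main__":
    # Hypothetical strain collection for one bioagent and one primer pair:
    # six strains share one base count, three share another, one is unique.
    strains = ([(39, 31, 18, 46)] * 6
               + [(38, 31, 20, 45)] * 3
               + [(41, 30, 20, 43)] * 1)
    print(reference_base_counts(strains, threshold=0.8))
```

With a threshold of 0.8, the first base count covers 60% of the strains, so the second is added to reach 90% and the loop stops.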

For each base count not included in the reference base count set for the bioagent of interest, the script then proceeds to determine the manner in which the current base count differs from each of the base counts in the reference set. This difference may be represented as a combination of substitutions, Si=Xi, and insertions, Ii=Yi, or deletions, Di=Zi. If there is more than one reference base count, then the reported difference is chosen using rules that aim to minimize the number of changes and, in instances with the same number of changes, minimize the number of insertions or deletions. Therefore, the primary rule is to identify the difference with the minimum sum (Xi+Yi) or (Xi+Zi), e.g., one insertion rather than two substitutions. If there are two or more differences with the minimum sum, then the one that will be reported is the one that contains the most substitutions.
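One way to express these rules in code is sketched below. It is an interpretation, not the script itself: it assumes base counts are (A, G, C, T) tuples, treats each substitution as pairing one gained base with one lost base, and counts any remaining net gain as insertions or net loss as deletions. The function names and the example base counts are hypothetical.

```python
from typing import Dict, Sequence, Tuple

BaseCount = Tuple[int, int, int, int]  # assumed ordering: (A, G, C, T)


def describe_difference(current: BaseCount, reference: BaseCount) -> Dict[str, int]:
    """Express current - reference with the fewest possible changes."""
    deltas = [c - r for c, r in zip(current, reference)]
    gained = sum(d for d in deltas if d > 0)
    lost = -sum(d for d in deltas if d < 0)
    substitutions = min(gained, lost)  # each substitution pairs one gain with one loss
    return {
        "substitutions": substitutions,
        "insertions": gained - substitutions,
        "deletions": lost - substitutions,
    }


def closest_reference(current: BaseCount,
                      references: Sequence[BaseCount]) -> Tuple[BaseCount, Dict[str, int]]:
    """Pick the reference with the minimum number of changes; break ties in
    favor of the difference that contains the most substitutions."""
    def rank(reference: BaseCount) -> Tuple[int, int]:
        diff = describe_difference(current, reference)
        return (sum(diff.values()), -diff["substitutions"])

    best = min(references, key=rank)
    return best, describe_difference(current, best)


if __name__ == "__main__":
    current = (38, 31, 20, 45)
    references = [(39, 31, 18, 46), (38, 31, 19, 45)]
    # The second reference needs one insertion; the first needs two substitutions.
    print(closest_reference(current, references))
```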

Differences between a base count and a reference composition are categorized as one, two, or more substitutions, one, two, or more insertions, one, two, or more deletions, and combinations of substitutions and insertions or deletions. The different classes of nucleobase changes and their probabilities of occurrence have been delineated in U.S. Patent Application Publication No. 2004209260 (U.S. application Ser. No. 10/418,514) which is incorporated herein by reference in its entirety.

Example 3 Validation of Primer Pairs

The primer pairs were tested against a panel of papillomaviruses obtained from ATCC. The following viruses were obtained as full-length plasmid clones: ATCC 45150D (HPV-6b); ATCC 45151D (HPV-11); ATCC 45152D (HPV-18); and ATCC 45113D (HPV-16). The broad primer pair number 2534 amplified all four viruses tested at two different dilutions of the plasmids. A series of primer modifications, including, for example, inosine substitutions to overcome potential sequence mismatches, were introduced into the forward and reverse primers. Most of the modified primers tested showed improved performance across the test isolates. In addition to the primers broadly targeting the major species, a series of primers targeting papillomavirus groups A7, A9 and A10, which account for over 30 different papillomaviruses, were also tested. Table 4 provides the primer pairs used for papillomavirus identification and indicates isolates tested, target virus groups and major species covered.

TABLE 4 Primer Pairs Targeting Human Papillomaviruses

Primer Pair Number  Isolates Tested  Target Virus Group  Major Species Covered
2537, 2540          HPV-16           Group A9            HPV-16, HPV-31, HPV-33, HPV-35, HPV-52, HPV-58, HPV-67, and RhPV
2544, 2545          HPV-18           Group A7            HPV-18, HPV-39, HPV-45, HPV-59, HPV-68, and HPV-70
2546                HPV-6, HPV-11    Group A10           HPV-6, HPV-11, HPV-13, HPV-44, HPV-55, and PCPV

For additional testing and validation, two different HeLa cell lines infected with HPV-18 were obtained from ATCC (CCL-2 and CCL-2.2). These were tested at limiting dilutions using a subset of the primers tested above. Results are shown below. The primer pairs used for this test included the major human PaV primer pair 2685, the Group A7 targeted primers 2544 and 2545 and the Group A10 primer 2546.

In addition to testing the performance of the primers on the cell lines, plasmid DNA containing HPV-6b was spiked into the CCL-2 cell line to determine the dynamic range of detection of the two viruses, cell line derived HPV-18 and the plasmid-derived HPV-6b, simultaneously. In all the tests done, the broad primers as well as the Group A7 primers showed detection of HPV-18 in both cell lines at input levels between 1 and 10 cells per well. At an estimated copy number of approximately 20 HPV-18 genomes per cell, this corresponds to detection sensitivities between 20 and 200 genomes from cell lines containing papillomavirus sequences. In experiments done with a co-spike of HPV-6b plasmid into these cell lines, the detection ranges were comparable. HPV-6b was spiked in at two different, fixed concentrations of 200 copies and 2000 copies per well and amplified with the broad primer pair number 2534. Simultaneous detection of HPV-6b and HPV-18 was observed when the plasmid DNA was spiked in at 2000 copies into a range of CCL-2 cell concentrations from 1000 to 0 per well. HPV-18 was detected in all wells with the exception of the lowest input level (10 cells/well), in the presence of 2000 copies of HPV-6b. HPV-6b (2000 copies) was detected in the presence of HeLa cell loads up to 600 cells/well, with an effective HPV-18 concentration of approximately 12000 genomes/well. In another experiment, a plasmid spike of approximately 200 copies per well was used. In this case, HPV-18 was detected at all test concentrations, including the lowest cell concentration of 10 cells per well. The dynamic range for detection of the two viruses simultaneously is between 5 and 10 fold at the lower and higher ends, giving an overall dynamic range of approximately 25 fold for the detection of competing templates in the presence of each other. These experiments indicate that two or more viruses can be simultaneously detected using the same assay.
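The genome copy figures in this paragraph follow from multiplying the cell input per well by the estimated HPV-18 copy number per cell. A minimal sketch of that arithmetic is given below; the constant and function names are illustrative, and the value of 20 genomes per cell is the approximate estimate stated in the text.

```python
# Approximate HPV-18 copy number per HeLa cell, as estimated in the text.
GENOMES_PER_CELL = 20


def hpv18_genomes_per_well(cells_per_well: int) -> int:
    """Estimated HPV-18 genome copies contributed by the cells in one well."""
    return cells_per_well * GENOMES_PER_CELL


if __name__ == "__main__":
    for cells in (1, 10, 600):
        print(cells, "cells/well ->", hpv18_genomes_per_well(cells), "genomes/well")
```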

Example 4 Testing of Primer Pairs Against Strains of Human Papillomaviruses

A series of human papillomavirus samples were tested using the panel of primer pairs listed in Table 1 (primer pair numbers 2534, 2537, 2545, 2546, 2540, 2544, 2547 and 2684). The results are shown in Tables 5A and 5B and include experimentally determined base compositions. Strains of human papillomavirus identified are shown in the “Result” column. In most cases, the experimentally determined base compositions matched the base compositions of strains of human papillomaviruses stored in a base composition database.

TABLE 5A Base Composition Results for Primer Pair Numbers 2534, 2537,2545 and 2546 Primer Pair Primer Pair Primer Pair Primer Pair Sample IDResult 2534 2537 2545 2546 HPV13299 HPV type 74 38 30 22 44 NoDetectNoDetect 47 48 21 37 HPV13308 HPV type 44 41 30 19 44 Redo NoDetect 4749 22 35 HPV15465 HPV type 40 36 33 18 47 NoDetect NoDetect NoDetectHPV16837 HPV type 61 39 33 19 43 NoDetect NoDetect 47 50 21 35 HPV17259HPV type 6 39 31 18 46 NoDetect NoDetect 49 44 25 35 HPV0137 No matches35 33 18 48 NoDetect NoDetect NoDetect HPV0138 HPV type 40 36 33 18 4718 26 19 26 NoDetect Redo HPV0139 HPV type 6 41 30 20 43 + 20 21 19 29unknown bc 49 44 25 35 39 31 18 46 HPV0140 HPV type 6 39 31 18 46NoDetect NoDetect 49 44 25 35 HPV0141 HPV type 51 34 32 25 43 NoDetectNoDetect NoDetect HPV0142 Negative Redo NoDetect NoDetect Redo BlankNegative NoDetect IP NoDetect NoDetect HPV17436 HPV cand 62 39 34 20 41NoDetect Redo NoDetect (Qv18091) HPV17682 Negative NoDetect NoDetectRedo NoDetect HPV17741 HPV type 6 39 31 18 46 NoDetect NoDetect 49 44 2535 HPV18249 HPV type 11 38 31 20 45 NoDetect NoDetect 49 46 23 35 (SNP)HPV18287 HPV type 38 32 16 48 20 23 20 26 36 36 16 23 49 44 25 35 cand85HPV19189 Negative NoDetect NoDetect NoDetect Redo HPV19267 NegativeNoDetect NoDetect NoDetect NoDetect HPV19385 Negative NoDetect NoDetectNoDetect NoDetect HPV19463 HPV type 33 40 33 15 46 19 25 19 26 RedoNoDetect HPV19479 HPV type 52 Redo 17 25 20 27 Redo NoDetect HPV19486HPV type 70 40 34 21 39 IP 37 34 19 21 Redo HPV19802 HPV type 45 39 3418 43 NoDetect 40 32 12 27 NoDetect HPV19863 HPV type NoDetect 20 24 2025 Redo NoDetect 35/35H HPV19918 No matches NoDetect NoDetect RedoNoDetect HPV19955 HPV type 39 38 35 17 44 NoDetect 38 34 16 23 NoDetectHPV20010 No matches 35 36 19 44 NoDetect NoDetect NoDetect HPV20027 HPVtype 52 39 34 15 46 17 25 20 27 NoDetect NoDetect HPV20144 HPV type 16NoDetect 18 24 20 27 NoDetect NoDetect HPV20152 HPV type 53 41 30 18 45Redo NoDetect NoDetect HPV20215 HPV type 71 38 33 
21 42 NoDetectNoDetect NoDetect HPV20274 Negative NoDetect NoDetect Redo NoDetectHPV20289 No matches Redo NoDetect NoDetect NoDetect HPV20300 NegativeNoDetect NoDetect NoDetect NoDetect HPV20341 HPV type 31 39 29 16 50 1925 20 25 NoDetect NoDetect (SNP) HPV20361 Negative NoDetect NoDetectNoDetect NoDetect HPV20370 HPV type 18 38 33 16 47 NoDetect 40 30 14 27NoDetect HPV20384 No matches 37 35 18 44 18 26 19 26 NoDetect NoDetectHPV20387 Negative NoDetect NoDetect NoDetect NoDetect HPV20441 NegativeNoDetect NoDetect NoDetect NoDetect HPV20507 Negative NoDetect NoDetectRedo NoDetect HPV20521 HPV type 59 40 33 15 46 NoDetect NoDetectNoDetect HPV0137 HPV type 36 33 18 47 NoDetect NoDetect 51 43 22 3740/74 HPV0138 No matches 35 33 18 48 NoDetect NoDetect NoDetect HPV0139HPV type 6 39 31 18 46 NoDetect NoDetect 49 44 25 35 HPV0140 HPV type 639 31 18 46 20 21 19 29 NoDetect 49 44 25 35 HPV0141 Negative NoDetectNoDetect NoDetect NoDetect HPV0142 Negative NoDetect NoDetect NoDetectNoDetect HPV0143 HPV type 55 NoDetect NoDetect NoDetect 47 48 21 37HPV0144 Negative NoDetect NoDetect Redo NoDetect HPV0145 NegativeNoDetect NoDetect NoDetect NoDetect HPV0146 HPV type 59 NoDetectNoDetect NoDetect NoDetect HPV0147 No matches 43 30 17 44 20 25 18 26NoDetect NoDetect HPV0148 No matches 36 34 23 41 + 20 25 18 26 NoDetectNoDetect 39 34 24 37 HPV0149 HPV type 67 NoDetect 17 26 19 27 NoDetectNoDetect HPV0150 No matches NoDetect 18 26 19 26 NoDetect NoDetectHPV0151 HPV type 74 38 30 20 46 NoDetect NoDetect 51 43 22 37 HPV0152HPV type 74 38 30 20 46 NoDetect NoDetect 51 43 22 37 HPV0153 HPV type 6NoDetect NoDetect NoDetect 49 44 25 35 HPV0154 HPV type 72 39 30 24 41NoDetect Redo NoDetect HPV0155 Negative Redo NoDetect NoDetect NoDetectHPV0156 No matches NoDetect 18 26 19 26 NoDetect NoDetect HPV0157Negative NoDetect NoDetect NoDetect NoDetect HPV0158 Negative NoDetectRedo Redo NoDetect HPV0159 Negative NoDetect NoDetect NoDetect NoDetectHPV0160 Negative NoDetect NoDetect NoDetect NoDetect 
HPV0161 HPV type 639 31 18 46 NoDetect NoDetect NoDetect HPV0162 Negative NoDetectNoDetect NoDetect NoDetect HPV0163 Negative Redo NoDetect Redo NoDetectHPV0164 No matches 35 32 25 42 NoDetect NoDetect NoDetect HPV0165 Nomatches 37 34 22 41 NoDetect NoDetect NoDetect HPV0166 Negative NoDetectNoDetect NoDetect NoDetect HPV0167 No matches Redo NoDetect RedoNoDetect HPV0168 Negative NoDetect NoDetect Redo NoDetect HPV0169 HPVtype 74 38 30 22 44 NoDetect NoDetect NoDetect HPV0170 HPV type 74 38 3020 46 NoDetect Redo NoDetect HPV0171 HPV type 44 NoDetect 19 25 19 26NoDetect 47 49 22 35 (~25 copies) HPV0172 HPV type 44 41 30 19 44NoDetect NoDetect 47 49 22 35 HPV179 HPV type 54 NoDetect 22 23 20 24NoDetect NoDetect HPV180 HPV type 54 41 30 22 41 24 21 20 24 Redo RedoHPV181 No matches NoDetect 18 26 19 26 NoDetect NoDetect HPV182 HPV type44 41 30 19 44 NoDetect Redo 47 49 22 35 HPV183 HPV type 40 33 15 46NoDetect NoDetect NoDetect 59/33 HPV184 HPV type 30 42 28 20 44 NoDetectNoDetect NoDetect HPV185 HPV type 16 43 30 17 44 + 18 24 20 27 NoDetectRedo 39 34 15 46 HPV186 No matches 36 34 23 41 20 25 18 26 NoDetect RedoHPV187 No matches NoDetect 18 26 19 26 NoDetect NoDetect HPV188 HPV type37 31 20 46 18 26 19 26 NoDetect NoDetect 11, 13, 74, 44, 55, HPV189 Nomatches NoDetect NoDetect NoDetect NoDetect HPV190 Negative NoDetectNoDetect NoDetect NoDetect HPV191 HPV type 74 38 30 20 46 NoDetect Redo51 43 22 37 HPV192 Negative NoDetect NoDetect NoDetect NoDetect HPV193Negative NoDetect NoDetect NoDetect NoDetect HPV194 Negative NoDetectNoDetect NoDetect NoDetect HPV195 Negative NoDetect NoDetect RedoNoDetect HPV196 HPV type 6 39 31 18 46 NoDetect Redo 49 44 25 35 HPV197HPV type 72 39 30 24 41 NoDetect Redo NoDetect HPV198 HPV type 70NoDetect 19 24 19 27 Redo 47 50 21 35 HPV199 HPV type 44 41 30 19 44 1924 19 27 Redo 47 49 22 35 HPV200 Negative NoDetect NoDetect RedoNoDetect HPV201 No matches NoDetect NoDetect Redo Redo HPV202 HPV type 6Redo NoDetect Redo 49 44 25 35 HPV203 
Negative Redo NoDetect RedoNoDetect HPV204 Negative NoDetect NoDetect Redo NoDetect HPV205 HPV type39 35 32 25 NoDetect NoDetect NoDetect 42; 38 35 17 44 HPV206 NegativeNoDetect NoDetect Redo Redo HPV250425 HPV type 39 30 18 47 NoDetectNoDetect NoDetect 6vc HPV261931 HPV type 55 NoDetect NoDetect NoDetect47 48 21 37 HPV340160 HPV type 84 40 30 21 43 19 25 21 24 NoDetect RedoHPV397645 HPV type NoDetect 19 25 20 25 NoDetect NoDetect cand87HPV403876 HPV type 38 32 16 48 NoDetect 36 36 16 23 NoDetect cand85HPV525736 HPV type 42 NoDetect 20 21 19 29 NoDetect NoDetect HPV678087HPV type 70 39 34 21 40 NoDetect 37 34 19 21 Redo HPV683500 HPV type 4036 33 18 47 NoDetect NoDetect NoDetect HPV711336 HPV type 44 41 30 19 44NoDetect NoDetect 47 50 21 35 HPV766300 HPV type 56 37 35 20 42 23 22 1727 NoDetect Redo HPV781687 HPV type 72 39 30 24 41 NoDetect NoDetectNoDetect HPV857673 HPV type 44 43 29 17 45 NoDetect NoDetect 47 49 22 35HPV901338 Negative NoDetect NoDetect NoDetect NoDetect HPV901338Negative Redo NoDetect NoDetect NoDetect HPV922829 Negative NoDetectNoDetect NoDetect NoDetect HPV922829 Negative NoDetect NoDetect RedoNoDetect HPV932724 HPV type 54 41 30 22 41 24 21 20 24 NoDetect 47 49 2235 HPV999950 HPV type 67 NoDetect 17 26 19 27 NoDetect NoDetect

TABLE 5B
Base Composition Results for Primer Pair Numbers 2540, 2544, 2547 and 2684

Sample ID | Result | Primer Pair 2540 | Primer Pair 2544 | Primer Pair 2547 | Primer Pair 2684
HPV13299 | HPV type 74 | Redo | NoDetect | 24 21 18 17 | 41 31 25 47
HPV13308 | HPV type 44 | NoDetect | NoDetect | 23 22 18 17 | 42 32 22 48
HPV15465 | HPV type 40 | NoDetect | NoDetect | Redo | 37 35 21 51
HPV16837 | HPV type 61 | NoDetect | 40 38 32 38 | Redo | NoDetect
HPV17259 | HPV type 6 | NoDetect | NoDetect | 23 21 20 16 | 41 32 22 49
HPV0137 | No matches | NoDetect | NoDetect | NoDetect | 36 35 21 52
HPV0138 | HPV type 40 | NoDetect | 40 38 32 38 | Redo | 37 35 21 51
HPV0139 | HPV type 6 | NoDetect | NoDetect | 23 21 20 16 | 44 32 22 46 + 41 32 22 49
HPV0140 | HPV type 6 | NoDetect | NoDetect | 23 21 20 16 | 41 32 22 49
HPV0141 | HPV type 51 | NoDetect | NoDetect | NoDetect | 36 34 27 47
HPV0142 | Negative | Redo | NoDetect | NoDetect | NoDetect
Blank | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV17436 | HPV cand 62 (Qv18091) | NoDetect | NoDetect | NoDetect | 43 36 23 42
HPV17682 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV17741 | HPV type 6 | Redo | NoDetect | 23 21 20 16 | 41 32 22 49 + 44 32 22 46
HPV18249 | HPV type 11 | NoDetect | 39 37 34 38 | 22 22 17 19 | 40 32 23 49
HPV18287 | HPV type cand85 | 54 28 16 43 | NoDetect | 23 21 20 16 | NoDetect
HPV19189 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV19267 | Negative | Redo | NoDetect | NoDetect | NoDetect
HPV19385 | Negative | Redo | NoDetect | Redo | NoDetect
HPV19463 | HPV type 33 | 60 26 18 37 | NoDetect | Redo | 43 35 18 48
HPV19479 | HPV type 52 | NoDetect | NoDetect | Redo | NoDetect
HPV19486 | HPV type 70 | NoDetect | 40 35 31 42 | Redo | NoDetect
HPV19802 | HPV type 45 | NoDetect | 36 40 29 43 | Redo | 42 36 20 46
HPV19863 | HPV type 35/35H | 59 26 16 40 | NoDetect | Redo | NoDetect
HPV19918 | No matches | NoDetect | 41 37 32 38 | Redo | Redo
HPV19955 | HPV type 39 | Redo | 39 37 34 38 | Redo | NoDetect
HPV20010 | No matches | NoDetect | NoDetect | Redo | 39 38 22 45
HPV20027 | HPV type 52 | 59 28 22 32 | NoDetect | Redo | NoDetect
HPV20144 | HPV type 16 | 59 27 19 36 | NoDetect | Redo | NoDetect
HPV20152 | HPV type 53 | NoDetect | NoDetect | Redo | NoDetect
HPV20215 | HPV type 71 | NoDetect | NoDetect | NoDetect | 41 34 24 45
HPV20274 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV20289 | No matches | NoDetect | NoDetect | Redo | 36 34 26 48
HPV20300 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV20341 | HPV type 31 | 58 29 15 39 | NoDetect | Redo | 41 30 19 54
HPV20361 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV20370 | HPV type 18 | NoDetect | 36 38 33 41 | NoDetect | 42 34 18 50
HPV20384 | No matches | 56 30 20 35 | NoDetect | Redo | 39 39 21 45
HPV20387 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV20441 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV20507 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV20521 | HPV type 59 | 57 27 20 37 | 38 36 31 43 | Redo | 45 34 18 47
HPV0137 | HPV type 40/74 | NoDetect | NoDetect | 23 22 18 17 | 37 35 21 51; 41 31 23 49
HPV0138 | No matches | Redo | NoDetect | NoDetect | 47 33 23 41
HPV0139 | HPV type 6 | NoDetect | NoDetect | Redo | NoDetect
HPV0140 | HPV type 6 | NoDetect | NoDetect | Redo | 44 32 22 46
HPV0141 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV0142 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0143 | HPV type 55 | NoDetect | NoDetect | 23 22 18 17 | 42 31 25 46
HPV0144 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0145 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0146 | HPV type 59 | 57 27 20 37 | 37 37 31 43 | Redo | 45 34 18 47
HPV0147 | No matches | NoDetect | NoDetect | NoDetect | 43 33 19 49
HPV0148 | No matches | NoDetect | NoDetect | NoDetect | 40 36 26 42 / 35 38 26 45
HPV0149 | HPV type 67 | 56 30 19 36 | NoDetect | Redo | NoDetect
HPV0150 | No matches | NoDetect | NoDetect | Redo | NoDetect
HPV0151 | HPV type 74 | NoDetect | NoDetect | 23 22 18 17 | 41 31 23 49
HPV0152 | HPV type 74 | NoDetect | NoDetect | 23 22 18 17 | 41 31 23 49
HPV0153 | HPV type 6 | NoDetect | NoDetect | 23 21 20 16 | NoDetect
HPV0154 | HPV type 72 | NoDetect | NoDetect | NoDetect | NoDetect
HPV0155 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0156 | No matches | NoDetect | NoDetect | Redo | NoDetect
HPV0157 | Negative | Redo | NoDetect | Redo | NoDetect
HPV0158 | Negative | Redo | NoDetect | Redo | NoDetect
HPV0159 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0160 | Negative | Redo | NoDetect | NoDetect | NoDetect
HPV0161 | HPV type 6 | NoDetect | NoDetect | Redo | NoDetect
HPV0162 | Negative | Redo | NoDetect | NoDetect | NoDetect
HPV0163 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV0164 | No matches | Redo | NoDetect | Redo | 39 34 28 43
HPV0165 | No matches | Redo | NoDetect | Redo | NoDetect
HPV0166 | Negative | Redo | NoDetect | NoDetect | NoDetect
HPV0167 | No matches | NoDetect | 38 40 31 39 | Redo | NoDetect
HPV0168 | Negative | Redo | NoDetect | NoDetect | NoDetect
HPV0169 | HPV type 74 | Redo | NoDetect | 24 21 18 17 | 41 31 25 47
HPV0170 | HPV type 74 | NoDetect | NoDetect | 23 22 18 17 | 40 31 24 49
HPV0171 | HPV type 44 | Redo | NoDetect | 23 22 18 17 | 42 32 22 48 + 44 32 22 46
HPV0172 | HPV type 44 | NoDetect | NoDetect | 23 22 18 17 | 42 32 22 48
HPV179 | HPV type 54 | NoDetect | NoDetect | NoDetect | NoDetect
HPV180 | HPV type 54 | NoDetect | NoDetect | Redo | NoDetect
HPV181 | No matches | NoDetect | NoDetect | Redo | NoDetect
HPV182 | HPV type 44 | NoDetect | NoDetect | 23 22 18 17 | 42 32 22 48
HPV183 | HPV type 59/33 | 57 27 20 37 | 37 37 32 42 | Redo | 45 34 18 47
HPV184 | HPV type 30 | NoDetect | NoDetect | NoDetect | 43 30 21 50
HPV185 | HPV type 16 | 59 27 19 36 | NoDetect | Redo | 43 33 19 49
HPV186 | No matches | NoDetect | NoDetect | Redo | 40 36 26 42
HPV187 | No matches | NoDetect | NoDetect | NoDetect | NoDetect
HPV188 | HPV type 11, 13, 74, 44, 55, etc. | NoDetect | NoDetect | 23 22 18 17 | 40 32 23 49
HPV189 | No matches | NoDetect | 38 40 31 39 | NoDetect | NoDetect
HPV190 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV191 | HPV type 74 | NoDetect | NoDetect | 23 22 18 17 | 41 31 23 49
HPV192 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV193 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV194 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV195 | Negative | NoDetect | NoDetect | Redo | Redo
HPV196 | HPV type 6 | Redo | NoDetect | NoDetect | 41 32 22 49
HPV197 | HPV type 72 | NoDetect | NoDetect | NoDetect | 41 34 24 45
HPV198 | HPV type 70 | NoDetect | NoDetect | NoDetect | NoDetect
HPV199 | HPV type 44 | NoDetect | NoDetect | Redo | 42 32 22 48
HPV200 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV201 | No matches | NoDetect | NoDetect | Redo | 39 39 21 45
HPV202 | HPV type 6 | Redo | NoDetect | 23 21 20 16 | 41 32 22 49
HPV203 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV204 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV205 | HPV type 39 | NoDetect | 39 37 34 38 | Redo | 39 34 28 43
HPV206 | Negative | NoDetect | NoDetect | Redo | NoDetect
HPV250425 | HPV type 6vc | NoDetect | NoDetect | NoDetect | NoDetect
HPV261931 | HPV type 55 | Redo | NoDetect | 23 22 18 17 | NoDetect
HPV340160 | HPV type 84 | Redo | NoDetect | NoDetect | 36 37 26 45
HPV397645 | HPV type cand87 | Redo | NoDetect | Redo | NoDetect
HPV403876 | HPV type cand85 | 54 28 16 43 | NoDetect | Redo | NoDetect
HPV525736 | HPV type 42 | Redo | NoDetect | Redo | NoDetect
HPV678087 | HPV type 70 | Redo | NoDetect | Redo | NoDetect
HPV683500 | HPV type 40 | Redo | NoDetect | NoDetect | 37 35 21 51
HPV711336 | HPV type 44 | NoDetect | NoDetect | 23 22 18 17 | 42 32 22 48
HPV766300 | HPV type 56 | Redo | NoDetect | NoDetect | NoDetect
HPV781687 | HPV type 72 | Redo | NoDetect | NoDetect | NoDetect
HPV857673 | HPV type 44 | NoDetect | NoDetect | 23 22 18 17 | 42 32 22 48
HPV901338 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV901338 | Negative | Redo | NoDetect | Redo | NoDetect
HPV922829 | Negative | NoDetect | NoDetect | NoDetect | NoDetect
HPV922829 | Negative | Redo | NoDetect | NoDetect | Redo
HPV932724 | HPV type 54 | Redo | NoDetect | 23 22 18 17 | 45 31 23 45
HPV999950 | HPV type 67 | 56 30 19 36 | NoDetect | NoDetect | NoDetect
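
Table 5B suggests how results from several primer pairs narrow an identification: the primer pair 2547 composition 23 22 18 17 appears for samples called type 44, type 55 and type 74, while the primer pair 2684 compositions differ among those samples. The following Python sketch illustrates that kind of lookup and intersection. The `REFERENCE` entries are read off a few rows of Table 5B purely for illustration and are an assumption; they are not the reference database used to generate the table, and the function names are hypothetical.

```python
# Illustrative reference entries read off rows of Table 5B.
# Assumption: this is NOT the actual database behind the table.
REFERENCE = {
    2547: {
        (23, 22, 18, 17): {"HPV type 44", "HPV type 55", "HPV type 74"},
        (23, 21, 20, 16): {"HPV type 6"},
    },
    2684: {
        (42, 32, 22, 48): {"HPV type 44"},
        (42, 31, 25, 46): {"HPV type 55"},
        (41, 31, 23, 49): {"HPV type 74"},
        (41, 32, 22, 49): {"HPV type 6"},
    },
}


def parse_cell(cell):
    """Return a 4-tuple of base counts, or None for NoDetect, Redo, or any
    cell that is not a single four-number composition."""
    parts = cell.split()
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return tuple(int(p) for p in parts)
    return None


def identify(results):
    """Intersect candidate types across primer pairs.

    results maps primer pair number -> cell text. Returns None when no
    primer pair produced a composition, otherwise a sorted list of the
    types consistent with every composition (possibly empty).
    """
    candidates = None
    for pair, cell in results.items():
        comp = parse_cell(cell)
        if comp is None:
            continue
        matches = REFERENCE.get(pair, {}).get(comp, set())
        candidates = set(matches) if candidates is None else candidates & matches
    return None if candidates is None else sorted(candidates)


# Row HPV13308: two primer pairs together leave a single candidate.
print(identify({2540: "NoDetect", 2544: "NoDetect",
                2547: "23 22 18 17", 2684: "42 32 22 48"}))
# Primer pair 2547 alone leaves three candidates.
print(identify({2547: "23 22 18 17"}))
```

Cells that report more than one composition (for example, two compositions joined by "+") are deliberately left unhandled in this sketch.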

Various modifications of the invention, in addition to those described herein, will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety.

1. A composition comprising at least one purified oligonucleotide primer pair that comprises forward and reverse primers, wherein said primer pair comprises nucleic acid sequences that are substantially complementary to nucleic acid sequences of two or more different bioagents belonging to the HPV family, wherein said primer pair is configured to produce amplicons comprising different base compositions that correspond to said two or more different bioagents.
2. The composition of claim 1, wherein said primer pair is configured to hybridize with conserved regions of said two or more different bioagents and flank variable regions of said two or more different bioagents.
3. The composition of claim 1, wherein said forward and reverse primers are about 15 to 35 nucleobases in length, and wherein the forward primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 1-8, and the reverse primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 9-16.
4. The composition of claim 1, wherein said primer pair is selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, and 8:16.
5. The composition of claim 1, wherein said forward and reverse primers are about 15 to 35 nucleobases in length, and wherein: the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 1, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 9; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 2, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 10; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 3, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 11; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 4, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 12; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 5, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 13; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 6, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 14; the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 7, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 15; and the forward primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 8, and the reverse primer comprises at least 70% sequence identity with the sequence of SEQ ID NO: 16.
6. The composition of claim 1, wherein said different base compositions identify said two or more different bioagents at genus, species, or sub-species levels.
7. The composition of claim 1, wherein said two or more amplicons are 45 to 200 nucleobases in length.
8. A kit comprising the composition of claim 1.
9. The composition of claim 1, wherein said different bioagents are selected from the group consisting of HPV-16, HPV-31, HPV-18, HPV-45, HPV-6, and HPV-11.
10. The composition of claim 1, wherein a non-templated T residue on the 5′-end of said forward and/or reverse primer is removed.
11. The composition of claim 1, wherein said forward and/or reverse primer further comprises a non-templated T residue on the 5′-end.
12. The composition of claim 1, wherein said forward and/or reverse primer comprises at least one molecular mass modifying tag.
13. The composition of claim 1, wherein said forward and/or reverse primer comprises at least one modified nucleobase.
14. The composition of claim 13, wherein said modified nucleobase is 5-propynyluracil or 5-propynylcytosine.
15. The composition of claim 13, wherein said modified nucleobase is a mass modified nucleobase.
16. The composition of claim 15, wherein said mass modified nucleobase is 5-Iodo-C.
17. The composition of claim 13, wherein said modified nucleobase is a universal nucleobase.
18. The composition of claim 17, wherein said universal nucleobase is inosine.
19. A composition comprising an isolated primer 15-35 bases in length selected from the group consisting of SEQ ID NOs 1-16.
20. A kit, comprising at least one purified oligonucleotide primer pair that comprises forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein said forward primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 1-8, and said reverse primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 9-16.
21. A method of determining the presence of a HPV in at least one sample, the method comprising: (a) amplifying one or more segments of at least one nucleic acid from said sample using at least one purified oligonucleotide primer pair that comprises forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein said forward primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 1-8, and said reverse primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOs: 9-16 to produce at least one amplification product; and (b) detecting said amplification product, thereby determining said presence of said HPV in said sample.
22. The method of claim 21, wherein (a) comprises amplifying said one or more segments of said at least one nucleic acid from at least two samples obtained from different geographical locations to produce at least two amplification products, and (b) comprises detecting said amplification products, thereby tracking an epidemic spread of said HPV.
23. The method of claim 21, wherein (b) comprises determining an amount of said HPV in said sample.
24. The method of claim 21, wherein (b) comprises detecting a molecular mass of said amplification product.
25. The method of claim 21, wherein (b) comprises determining a base composition of said amplification product, wherein said base composition identifies the number of A residues, C residues, T residues, G residues, U residues, analogs thereof and/or mass tag residues thereof in said amplification product, whereby said base composition indicates the presence of HPV in said sample or identifies said HPV in said sample.
26. The method of claim 25, comprising comparing said base composition of said amplification product to calculated or measured base compositions of amplification products of one or more known HPV present in a database with the proviso that sequencing of said amplification product is not used to indicate the presence of or to identify said HPV, wherein a match between said determined base composition and said calculated or measured base composition in said database indicates the presence of or identifies said HPV.
27. A method of identifying one or more HPV bioagents in a sample, the method comprising: (a) amplifying two or more segments of a nucleic acid from said one or more HPV bioagents in said sample with two or more oligonucleotide primer pairs to obtain two or more amplification products; (b) determining two or more molecular masses and/or base compositions of said two or more amplification products; and (c) comparing said two or more molecular masses and/or said base compositions of said two or more amplification products with known molecular masses and/or known base compositions of amplification products of known HPV bioagents produced with said two or more primer pairs to identify said one or more HPV bioagents in said sample.
28. The method of claim 27, comprising identifying said one or more HPV bioagents in said sample using three, four, five, six, seven, eight or more primer pairs.
29. The method of claim 27, wherein said one or more HPV bioagents in said sample cannot be identified using a single primer pair of said two or more primer pairs.
30. The method of claim 27, comprising obtaining said two or more molecular masses of said two or more amplification products via mass spectrometry.
31. The method of claim 27, comprising calculating said two or more base compositions from said two or more molecular masses of said two or more amplification products.
32. The method of claim 27, wherein said HPV bioagents are selected from the group consisting of a Papillomaviridae family, a genus thereof, a species thereof, a sub-species thereof, and combinations thereof.
33. The method of claim 27, wherein said two or more primer pairs comprise two or more purified oligonucleotide primer pairs that each comprise forward and reverse primers that are about 20 to 35 nucleobases in length, and wherein said forward primers comprise at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 1-8, and said reverse primers comprise at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 9-16 to obtain an amplification product.
34. The method of claim 27, wherein said primer pairs are selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, and 8:16.
35. The method of claim 27, wherein said determining said two or more molecular masses and/or base compositions is conducted without sequencing said two or more amplification products.
36. The method of claim 27, wherein said one or more HPV bioagents in said sample cannot be identified using a single primer pair of said two or more primer pairs.
37. The method of claim 27, wherein said one or more HPV bioagents in a sample are identified by comparing three or more molecular masses and/or base compositions of three or more amplification products with a database of known molecular masses and/or known base compositions of amplification products of known HPV bioagents produced with said three or more primer pairs.
38. The method of claim 27, wherein said two or more segments of said nucleic acid are amplified from a single gene.
39. The method of claim 27, wherein said two or more segments of said nucleic acid are amplified from different genes.
40. The method of claim 27, wherein members of said primer pairs hybridize to conserved regions of said nucleic acid that flank a variable region.
41. The method of claim 40, wherein said variable region varies between at least two of said HPV bioagents.
42. The method of claim 40, wherein said variable region uniquely varies between at least five of said HPV bioagents.
43. The method of claim 27, wherein said two or more amplification products obtained in (a) comprise major classification and subgroup identifying amplification products.
44. The method of claim 43, comprising comparing said molecular masses and/or said base compositions of said two or more amplification products to calculated or measured molecular masses or base compositions of amplification products of known HPV bioagents in a database comprising genus specific amplification products, species specific amplification products, strain specific amplification products or nucleotide polymorphism specific amplification products produced with said two or more oligonucleotide primer pairs, wherein one or more matches between said two or more amplification products and one or more entries in said database identifies said one or more HPV bioagents, classifies a major classification of said one or more HPV bioagents, and/or differentiates between subgroups of known and unknown HPV bioagents in said sample.
45. The method of claim 44, wherein said major classification of said one or more HPV bioagents comprises a genus or species classification of said one or more HPV bioagents.
46. The method of claim 44, wherein said subgroups of known and unknown HPV bioagents comprise family, strain and nucleotide variations of said one or more HPV bioagents.
47. A system, comprising: (a) a mass spectrometer configured to detect one or more molecular masses of amplicons produced using at least one purified oligonucleotide primer pair that comprises forward and reverse primers, wherein said primer pair comprises nucleic acid sequences that are substantially complementary to nucleic acid sequences of two or more different HPV bioagents; and (b) a controller operably connected to said mass spectrometer, said controller configured to correlate said molecular masses of said amplicons with one or more HPV bioagent identities.
48. The system of claim 47, wherein said HPV bioagent identities are at genus, species, and/or sub-species levels.
49. The system of claim 47, wherein said forward and reverse primers are about 15 to 35 nucleobases in length, and wherein the forward primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 1-8, and the reverse primer comprises at least 70% sequence identity with a sequence selected from the group consisting of SEQ ID NOS: 9-16.
50. The system of claim 47, wherein said primer pair is selected from the group of primer pair sequences consisting of: SEQ ID NOS: 1:9, 2:10, 3:11, 4:12, 5:13, 6:14, 7:15, and 8:16.
51. The system of claim 47, wherein said controller is configured to determine base compositions of said amplicons from said molecular masses of said amplicons, which base compositions correspond to said one or more HPV bioagent identities.
52. The system of claim 47, wherein said controller comprises or is operably connected to a database of known molecular masses and/or known base compositions of amplicons of known HPV bioagents produced with the primer pair.
53. A purified oligonucleotide primer pair, comprising a forward primer and a reverse primer that each independently comprise 14 to 40 consecutive nucleobases selected from the primer pair sequences shown in Table 1 and/or Table 2, which primer pair is configured to generate an amplicon between about 50 and 150 consecutive nucleobases in length.
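
Claims 31 and 51 refer to calculating base compositions from molecular masses. The following Python sketch illustrates one way such a calculation could be set up for a single strand of known length: it enumerates every A/C/G/T count combination whose summed mass falls within a tolerance of the measured mass. The residue masses are approximate average values for unmodified DNA nucleotide residues, terminal groups are ignored, and the function names, the fixed-length assumption and the 0.5 Da tolerance are all illustrative assumptions rather than parameters taken from this application.

```python
# Approximate average masses (Da) of unmodified DNA nucleotide residues.
# Assumption: terminal groups and any mass tags are ignored in this sketch.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def strand_mass(comp):
    """Mass of a single strand with the given base counts, e.g. {"A": 20, ...}."""
    return sum(RESIDUE_MASS[base] * count for base, count in comp.items())


def compositions_for_mass(mass, length, tol=0.5):
    """Enumerate base compositions of a given total length whose mass is
    within tol (Da) of the measured mass."""
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                t = length - a - c - g
                comp = {"A": a, "C": c, "G": g, "T": t}
                if abs(strand_mass(comp) - mass) <= tol:
                    hits.append(comp)
    return hits


# Hypothetical 80-nucleobase strand used only to exercise the search.
TRUE_COMP = {"A": 20, "C": 15, "G": 25, "T": 20}
HITS = compositions_for_mass(strand_mass(TRUE_COMP), length=80)
print(len(HITS), TRUE_COMP in HITS)
```

More than one composition can fall inside the tolerance (for instance, exchanging eight T residues for five A and three C residues shifts the mass by only about 0.01 Da with these values), so a practical implementation would need additional constraints or a tighter tolerance to settle on a single composition.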